TITLE: Three Dimensional Structure of a Sterile Alpha Motif Domain
FIELD OF THE INVENTION 

The invention relates to the three dimensional structure of a sterile alpha motif (SAM)
domain. The atomic coordinates that define the structure and any compounds bound to the structure 
enable the determination of homologues, the three dimensional structures of polypeptides with 
unknown structure, and the identification of modulators of a SAM domain. 
BACKGROUND OF THE INVENTION 

The Eph family of receptor tyrosine kinases has been implicated in the control of axon
guidance [Henkemeyer, 1996; Orioli, 1996], cell migration [Krull, 1997], patterning of the nervous
system [Xu, 1996] and angiogenesis [Wang, 1998], and its members are activated by clustering into
dimers or tetramers [Stein, 1998]. However, the cell-surface ligands for Eph receptors (ephrins)
apparently lack an intrinsic ability to induce receptor oligomerization [Lackmann, 1997]. Factors that
influence receptor aggregation include the pre-clustering of ephrins [Davis, 1994], the homotypic
interaction between the extracellular domains of two receptor chains [Lackmann, 1998], and the
binding of PDZ domain containing proteins to the receptor's C-terminus [Hock, 1998].

All Eph receptors have a Sterile Alpha Motif (SAM) domain within their cytoplasmic
regions. The SAM domain was identified as a conserved sequence present in a small set of yeast
sexual differentiation proteins referred to as the Sterile Alpha Mating factors [Ponting, 1995; Schultz,
1997]. In ETS family transcription factors this sequence has also been termed the Pointed domain
[Klambt, 1993]. The domain is found in a variety of proteins, many of which contain catalytic
domains or recognized protein interaction domains. SAM domains are almost always located at a
protein's N- or C-terminus. A highly conserved SAM domain is located in the cytoplasmic region of
Eph receptors (approx. 50% identity over 14 family members), C-terminal to the catalytic domain
and followed by only 5 residues that form a potential PDZ domain binding site [Hock, 1998].
Amongst receptor tyrosine kinases, the presence of a cytoplasmic module other than the protein
kinase domain is unique to Eph receptors.

The SAM domain can function as a protein interaction module through an ability to homo-
and hetero-dimerize with other SAM domains [Jousset, 1997; Peterson, 1997; Tu, 1997; Kyba, 1998].
This dimerizing property elicits oncogenic activation of chimeric proteins arising from translocation
of the SAM domain of TEL to coding regions of the β PDGF receptor [Golub, 1994], Abl [Golub,
1996], and JAK2 protein kinases [Lacronique, 1997] or the AML1 transcription factor [Golub, 1995].
A functional role in mediating homo- and hetero-typic dimerization has been shown for SAM domains
in the transcription factor TEL [Jousset, 1997], members of the polycomb group of transcriptional
repressors (RAE28, Scm and ph) [Peterson, 1997], the protein kinase Byr2p [Tu, 1997] and the α
and β isoforms of the liprin scaffolding proteins [Serra-Pages, 1998].
SUMMARY OF THE INVENTION 

Broadly stated, the present invention relates to the three-dimensional structure of one or
more SAM domains. The three-dimensional structures may be complexed with one or more
compounds. The defined boundaries and properties of the structures and any of the compounds bound
to them are pertinent to methods for determining the three-dimensional structures of polypeptides with
unknown structure, and to methods that identify modulators of SAM domain function. These
modulators are potentially useful as therapeutics for diseases, including (but not limited to) cell
proliferative diseases, such as cancer, angiogenesis, atherosclerosis, and arthritis, and diseases
associated with the nervous system.

Broadly stated, the present invention relates to a crystalline form of a polypeptide
corresponding to one or more SAM domains, preferably one or more SAM domains of an Eph
receptor, preferably of EphA. The crystalline form may comprise one or more heavy metal atoms, or
at least one compound. In a preferred embodiment, a unit cell of the crystalline form of the invention
has dimensions of about a = b = 77.14 ± 0.03 angstroms, c = 24.3 ± 0.04 angstroms.

The invention also relates to a method of forming a crystalline form of the invention 
comprising 

(a) mixing a volume of a SAM domain with a reservoir solution; and 

(b) incubating the mixture obtained in step (a) over the reservoir solution in a closed
container under conditions suitable for crystallization.

The invention also features a method of determining three dimensional structures of 
polypeptides with unknown structure comprising the step of applying the structural atomic 
coordinates of a crystalline form of one or more SAM domains of the invention. 

Methods are also provided for identifying a potential modulator of a SAM domain function,
preferably a SAM domain of an Eph receptor function, by docking a computer representation of a
structure of a compound with a computer representation of a structure of one or more SAM domains
of the invention, preferably a SAM domain of an Eph receptor, that is defined by the atomic structural
coordinates described herein. In an embodiment the method comprises the following steps:

(a) docking a computer representation of a compound from a computer data base with
a computer representation of a selected site on a SAM domain, preferably a SAM
domain of an Eph receptor, to obtain a complex;

(b) determining a conformation of the complex with a favourable geometric fit and
favourable complementary interactions; and

(c) identifying compounds that best fit the selected site as potential modulators of
SAM domain function.

In another embodiment the method comprises the following steps: 

(a) modifying a computer representation of a compound complexed with a selected site
on a SAM domain, preferably a SAM domain of an Eph receptor, by deleting or
adding a chemical group or groups;

(b) determining a conformation of the complex with a favourable geometric fit and
favourable complementary interactions; and

(c) identifying a compound that best fits the selected site as a potential modulator of a
SAM domain.

In still another embodiment the method comprises the following steps:



WO 00/37500 PCT/CA99/01209 


(a) selecting a computer representation of a compound complexed with a selected site 
on a SAM domain, preferably a SAM domain of an Eph receptor, and 

(b) searching for molecules in a data base that are similar to the compound using a
searching computer program, or replacing portions of the compound with similar 
chemical structures from a data base using a compound building computer 
program. 

The invention also features a potential modulator of a function of a SAM domain, preferably
a SAM domain of an Eph receptor, identified by the methods of the invention, and a method of
treating a disease associated with a SAM domain, preferably a SAM domain of an Eph receptor, with
inappropriate activity in a cellular organism, comprising:

(a) administering a modulator identified using the methods of the invention in an
acceptable pharmaceutical preparation; and

(b) activating or inhibiting a SAM domain function to treat the disease.

The invention also provides peptides that mediate SAM domain function.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described in relation to the drawings in which: 
Figure 1A shows a sequence alignment of SAM domains from selected proteins (SEQ. ID.
NOS. 1 to 21);

Figure 1B shows a selection of multi-domain proteins containing SAM domain (S);

Figure 2A is a ribbons depiction of the SAM homo-dimer viewed down the twofold
symmetry axis;

Figure 2B is a ribbons depiction of the SAM homo-dimer viewed perpendicular to the
symmetry axis;

Figure 2C is a ribbons stereo view highlighting the dimer interface region;

Figure 3A is a molecular surface and worm representation of the SAM homodimer;

Figure 3B is a molecular surface and worm representation of the SAM homodimer; and

Figure 4 is a gel filtration elution profile of wild type and single or double site mutants of the
EphA4 receptor SAM domain.

DETAILED DESCRIPTION OF THE INVENTION
DEFINITIONS:

Unless otherwise indicated, all terms used herein have the same meaning as they would to
one skilled in the art of the present invention. Practitioners are particularly directed to Current
Protocols in Molecular Biology (Ausubel) for definitions and terms of the art.

Abbreviations for amino acid residues are the standard 3-letter and/or 1-letter codes used in
the art to refer to one of the 20 common L-amino acids. Likewise abbreviations for nucleic acids are
the standard codes used in the art.

The term "crystalline form", in the context of the invention, is a crystal formed from an
aqueous solution comprising a purified polypeptide comprising one or more SAM domains,
preferably a SAM domain of an Eph receptor. A crystalline form of a SAM domain is characterized
as being capable of diffracting X-rays in a pattern defined by one of the crystal forms depicted in
Blundell et al., 1976, Protein Crystallography, Academic Press. A crystalline form may include a
crystal structure in association with one or more heavy-metal atoms, i.e. a derivative crystal, or a
crystal structure in association with one or more compounds, i.e. a co-crystal.

The term "association" refers to a condition of proximity between a chemical entity or
compound, or portions or fragments thereof, and a SAM domain or portions or fragments thereof. The
association may be non-covalent, i.e. where the juxtaposition is energetically favored by, for example,
hydrogen-bonding, van der Waals, or electrostatic or hydrophobic interactions, or it may be covalent.

The term "heavy-metal atom" refers to an atom that is a transition element, a lanthanide
metal, or an actinide metal. Lanthanide metals include elements with atomic numbers between 57 and
71, inclusive. Actinide metals include elements with atomic numbers between 89 and 103, inclusive.
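
The definition above can be expressed as a simple lookup by atomic number. The following is a minimal sketch only: the lanthanide and actinide ranges are taken from the definition, whereas the transition-element ranges (d-block, groups 3 to 12) and the function name heavy_metal_class are assumptions made for illustration, since conventions for the transition elements vary.

```python
# Lanthanide and actinide ranges follow the definition in the text.
LANTHANIDES = range(57, 72)   # atomic numbers 57..71 inclusive
ACTINIDES = range(89, 104)    # atomic numbers 89..103 inclusive

# Assumption: d-block elements of groups 3-12 are treated as transition
# elements; the text does not give explicit ranges for them.
TRANSITION_RANGES = (range(21, 31), range(39, 49), range(72, 81), range(104, 113))


def heavy_metal_class(atomic_number):
    """Return 'lanthanide', 'actinide', 'transition', or None."""
    if atomic_number in LANTHANIDES:
        return "lanthanide"
    if atomic_number in ACTINIDES:
        return "actinide"
    if any(atomic_number in r for r in TRANSITION_RANGES):
        return "transition"
    return None


if __name__ == "__main__":
    # Gold (79) and mercury (80) are the examples named later in the text.
    for z in (79, 80, 57, 103, 6):
        print(z, heavy_metal_class(z))
```

Under these assumed ranges, gold and mercury both fall in the transition class, while carbon (6) is not classified as a heavy-metal atom.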

The term "Eph receptor" refers to a subfamily of closely related transmembrane receptor
tyrosine kinases related to Eph, a receptor named for its expression in an erythropoietin-producing
human hepatocellular carcinoma cell line. The receptors contain cell adhesion-like domains on their
extracellular surface. The Eph subfamily receptor tyrosine kinases are more specifically characterised
as encoding a structurally related cysteine rich extracellular domain containing a single
immunoglobulin (Ig)-like loop near the N-terminus and two fibronectin III (FN III) repeats adjacent
to the plasma membrane. The Eph receptors are divided into two groups based on the relatedness of
their extracellular domain sequences. The grouping also corresponds to the ability of the receptors to
bind preferentially to the ephrin-A or ephrin-B proteins. The group that includes receptors interacting
preferentially with ephrin-A proteins is called EphA and includes EphA1 (also known as Eph and
Esk), EphA2 (also known as Eck, Myk2, Sek2), EphA3 (also known as Cek4, Mek4, Hek, Tyro4,
Hek4), EphA4 (also known as Sek, Sek1, Cek8, Hek8, Tyro1), EphA5 (also known as Ehk1, Bsk,
Cek7, Hek7, and Rek7), EphA6 (also known as Ehk2 and Hek12), EphA7 (also known as Mdk1,
Hek11, Ehk3, Ebk, Cek11), and EphA8 (also known as Eek, Hek3). The group that includes receptors
interacting preferentially with ephrin-B proteins is called EphB and includes EphB1 (also known as
Elk, Cek6, Net, Hek6), EphB2 (also known as Cek5, Nuk, Erk, Qek5, Tyro5, Sek3, Hek5, Drt), EphB3
(also known as Cek10, Hek2, Mdk5, Tyro6, and Sek4), EphB4 (also known as Htk, Myk1, Tyro11,
Mdk2), EphB5 (also known as Cek9, Hek9), and EphB6 (also known as Mep).

"Ephrin" refers to a class of ligands which are anchored to the cell membrane through a
transmembrane domain, and bind to the extracellular domain of an Eph receptor, facilitating
dimerization and autophosphorylation of the receptor and autophosphorylation of the ligand. The
ephrins which are targeted in the methods of the invention are those that bind to and activate (i.e.
phosphorylate) an EphA or an EphB receptor, preferably an EphA receptor. The ephrin-A ligands
(GPI-anchored ligands) are ephrin-A1 (also known as B61, LERK1, EFL-1), ephrin-A2 (also known as
LERK6, Elf1, mCek7-L, cElf1), ephrin-A3 (also known as LERK3, Ehk1-L, and EFL-2), ephrin-A4
(also known as LERK4, EFL-4, mLERK4), and ephrin-A5 (also known as AL1, LERK7, EFL-5, mAL1,
[rLERK7], RAGS), and the ephrin-B ligands (transmembrane ligands) are ephrin-B1 (also known as
LERK2, ELK-L, EFL-3, Cek5-L, Stra1, [LERK2]), ephrin-B2 (also known as LERK5, HTK-L, NLERK1,
Elf2, Htk-L), and ephrin-B3 (also known as LERK8, ELK-L3, NLERK2, EFL-6, Elf3, [rELK-L3]).

The term "SAM domain" refers to a region known as the Sterile Alpha Motif (SAM) domain
within the cytoplasmic regions of all Eph receptors (Figure 1B), and in other proteins such as TEL
[Jousset, 1997], members of the polycomb group of transcriptional repressors (RAE28, Scm and ph)
[Peterson, 1997], the protein kinase Byr2p [Tu, 1997], the α and β isoforms of the liprin scaffolding
proteins [Serra-Pages, 1998], and tankyrase (Smith, S. et al., Science 282: 1484-1487, 1998, Accession
AF082556). The SAM domain was identified as a conserved sequence present in a small set of yeast
sexual differentiation proteins referred to as the Sterile Alpha Mating factors [Ponting, 1995; Schultz,
1997]. In ETS family transcription factors this sequence has also been termed the Pointed domain
[Klambt, 1993]. Extensive database searching and sequence alignment analysis (Figure 1A) reveals
that this domain is found in a variety of proteins, many of which contain catalytic domains or
recognized protein interaction domains (Figure 1B). SAM domains are almost always located at a
protein's N- or C-terminus. A highly conserved SAM domain is located in the cytoplasmic region of
Eph receptors (approximately 50% identity over 14 family members), C-terminal to the catalytic
domain and followed by only 5 residues that form a potential PDZ domain binding site [Hock, 1998].
The term also includes amino acid sequences having substantial sequence identity to a SAM domain,
a mutant, or a subunit of a SAM domain. Preferably the SAM domain is an "Eph SAM domain", i.e.
a SAM domain of an Eph receptor.

"SAM domain structure" or "SAM domain three dimensional structure" refers to the three
dimensional structure of a purified polypeptide comprising one or more SAM domains, preferably a
crystalline form.

As applied to polypeptides, the term "substantial sequence identity" means that two peptide
sequences, when optimally aligned, such as by the programs GAP or BESTFIT using default gap,
share at least 80 percent sequence identity, preferably at least 90 percent sequence identity, more
preferably at least 95 percent sequence identity or more. Preferably, residue positions which are not
identical differ by conservative amino acid substitutions. For example, the substitution of amino acids
having similar chemical properties such as charge or polarity is not likely to affect the properties of
a protein. Examples include glutamine for asparagine or glutamic acid for aspartic acid.
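
The identity thresholds above can be checked once two sequences have been aligned. The following is a minimal sketch, assuming the alignment has already been produced by a separate program and that gaps are written as "-"; the function names and the choice to count identity over ungapped columns only are illustrative assumptions, not the GAP or BESTFIT definition.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def has_substantial_identity(aligned_a, aligned_b, threshold=80.0):
    """Apply the 'at least 80 percent' criterion from the text by default."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical ten-residue peptides differing at one position.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(has_substantial_identity("ACDEFGHIKL", "ACDEFGHIKV", threshold=95.0))
```

The threshold argument can be raised to 90 or 95 to test the preferred levels of identity.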

The term "mutant" refers to a polypeptide that is obtained by replacing at least one amino
acid residue in a native SAM domain with a different amino acid residue. Mutation can also be
accomplished by adding and/or deleting amino acid residues within the native SAM domain. A
mutant may or may not be functional.

The term "function" refers to the ability of a modulator to enhance or inhibit the association
between a SAM domain and a compound.

The term "atomic structural coordinates" as used herein refers to a data set that defines the
three dimensional structure of a molecule or molecules (e.g. unit cell axial lengths, space group).
Structural coordinates can be slightly modified and still render nearly identical three dimensional
structures. A measure of a unique set of structural coordinates is the root-mean-square deviation of
the resulting structure. Structural coordinates that render three dimensional structures that deviate
from one another by a root-mean-square deviation of less than 1.5 Å may be viewed by a person of
ordinary skill in the art as identical. Structural coordinates for a SAM domain are in Table 2.
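
The root-mean-square deviation criterion can be illustrated with a short calculation. This is a minimal sketch, assuming the two coordinate sets list the same atoms in the same order and have already been superimposed by a separate fitting step; the function names and the example coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def viewed_as_identical(coords_a, coords_b, cutoff=1.5):
    """Apply the 'less than 1.5 angstroms' criterion from the text."""
    return rmsd(coords_a, coords_b) < cutoff


if __name__ == "__main__":
    # Two hypothetical atoms, each displaced by 1 angstrom along z.
    set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print(rmsd(set_a, set_b), viewed_as_identical(set_a, set_b))
```

Optimal superposition (for example by a least-squares fit) is deliberately left out of this sketch; without it the value returned is an upper bound on the deviation after fitting.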

The term "unit cell" refers to the smallest and simplest volume element (i.e. parallelepiped-shaped
block) of a crystal that is completely representative of the unit of pattern of the crystal. The
unit cell axial lengths are represented by a, b, and c where a = x axis, b = y axis, and c = z axis. Those
of skill in the art understand that a set of atomic coordinates determined by X-ray crystallography is
not without standard error.

The term "space group" refers to the symmetry of a unit cell. In a space group designation
the capital letter indicates the lattice type and the other symbols represent symmetry operations that
can be carried out on the unit cell without changing its appearance.

The term "purified", in reference to a polypeptide, does not require absolute purity such as a
homogeneous preparation; rather, it represents an indication that the sequence is relatively purer than in
the natural environment. Generally, a purified polypeptide is substantially free of other proteins,
lipids, carbohydrates, or other materials with which it is naturally associated, preferably at a
functionally significant level, for example at least 85% pure, more preferably at least 95% pure, most
preferably at least 99% pure. A skilled artisan can purify a polypeptide comprising a SAM domain
using standard techniques for protein purification. A substantially pure polypeptide comprising a SAM
domain will yield a single major band on a non-reducing polyacrylamide gel. The purity of the SAM
domain polypeptide can also be determined by amino-terminal amino acid sequence analysis.
Three Dimensional Structure of SAM Domain 
The present invention provides a purified SAM domain three dimensional structure. In an
embodiment the structure is a crystalline form. A SAM domain structure may comprise one or more
SAM domains in a unit cell, preferably two, three or four SAM domains. In a preferred embodiment,
a SAM domain is arranged in a crystalline manner in a space group P6₄ so as to form a unit cell of
dimensions a = b = 77.14 angstroms, c = 24.37 angstroms and which effectively diffracts X-rays for
determination of the atomic coordinates of the SAM domain to a resolution of about 2.9 angstroms.
The 3-dimensional structure of a preferred SAM domain of the invention is shown in Figures 2 and 3.
A crystalline form includes native crystals, derivative crystals, and co-crystals. The native
crystals generally comprise substantially pure polypeptides comprising one or more SAM domains in
crystalline form. It is understood that the crystalline form is not limited to naturally occurring or
native SAM domains but includes mutants of native SAM domains obtained by replacing at least one
amino acid residue in a native SAM domain with a different amino acid residue or by adding or
deleting amino acid residues within the native polypeptide, and having substantially the same three
dimensional structure as the native SAM domain from which the mutant is derived, i.e. having a set of
atomic structural coordinates that have a root mean square deviation of less than or equal to about 2 Å
when superimposed with the atomic structure coordinates of the native SAM domain from which the
mutant is derived when at least 50% to 100% of the atoms of the native SAM domain are included in
the superimposition. It should be noted that the mutants contemplated herein need not exhibit SAM
domain activity.

The derivative crystals of the invention generally comprise a crystalline SAM domain in
covalent association with one or more heavy metal atoms. The SAM domain may correspond to a
native or mutated SAM domain. Heavy metal atoms useful for providing derivative crystals include,
by way of example and not limitation, gold, mercury, etc.

The invention features a crystalline form of a SAM domain in association with one or more
compounds. The association may be covalent or non-covalent. These types of crystalline forms are
referred to herein as co-crystals. The compound may be any organic molecule, and it may modulate
the function of a SAM domain by, for example, inhibiting or enhancing its function, or it may be an
analogue of a SAM domain. It is preferred that the geometry of the compound and the interactions
formed between the compound and the SAM domain provide high affinity binding between the two
molecules. High affinity binding is preferably governed by a dissociation equilibrium constant on the
order of 10* or less.
Method for Preparing Crystal Forms of SAM Domain

The invention also features a method for creating the crystalline SAM domain structures
described herein. The method may utilize a polypeptide comprising a SAM domain described herein
to form a crystal. A polypeptide used in the method may be chemically synthesized in whole or in
part using techniques that are well-known in the art. Alternatively, methods are well known to the
skilled artisan to construct expression vectors containing the native or mutated SAM domain coding
sequence and appropriate transcriptional/translational control signals. These methods include in vitro
recombinant DNA techniques, synthetic techniques, and in vivo recombination/genetic
recombination. See for example the techniques described in Sambrook et al. (Molecular Cloning: A
Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory
textbooks.

Crystals are grown from an aqueous solution containing the purified and concentrated SAM
domain polypeptide by a variety of conventional processes. These processes include batch, liquid
bridge, dialysis, vapor diffusion, and hanging drop methods. (See for example, McPherson, 1982,
John Wiley, New York; McPherson, 1990, Eur. J. Biochem. 189: 1-23; Weber, 1991, Adv. Protein
Chem. 41:1-36). Generally, the native crystals of the invention are grown by adding precipitants to
the concentrated solution of the SAM domain polypeptide. The precipitants are added at a
concentration just below that necessary to precipitate the protein. Water is removed by controlled
evaporation to produce precipitating conditions, which are maintained until crystal growth ceases.
In an embodiment of the invention, the method generally comprises the steps of:

(a) mixing a volume of polypeptide solution with a reservoir solution; and

(b) incubating the mixture obtained in step (a) over the reservoir solution in a closed
container, under conditions suitable for crystallization.

For crystals of the invention, it has been found that hanging drops containing about 1 µl of
SAM domain polypeptide (50-150 mg/mL, preferably 100 mg/mL, in 5-20 mM, preferably 7 mM, Hepes
pH 5.5 to 9, preferably 7.5) and equal volumes of reservoir buffer (50-150 mM, preferably 100 mM,
cacodylate pH 5.5 to 7.5, preferably 6.5; 5-10%, preferably 7%, (w/v) PEG 8000; and 10-30%,
preferably 20%, (v/v) ethylene glycol) suspended overnight at room temperature provide crystals
suitable for high resolution X-ray structure determination. It will be appreciated that the above-
described crystallization conditions can be varied and such variations can be used alone or in
combination. For example, other buffer solutions such as Tris-HCl buffer may be used.
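
As a worked illustration of the equal-volume mixing described above, the starting concentrations in a hanging drop can be estimated by simple dilution. This is a minimal sketch under stated assumptions: ideal mixing, additive volumes, and the preferred values from the text (100 mg/mL protein in 7 mM Hepes; 100 mM cacodylate, 7% PEG and 20% ethylene glycol in the reservoir). The function name and dictionary keys are hypothetical.

```python
def mix_drop(protein_volume_ul, protein_solution, reservoir_volume_ul, reservoir_solution):
    """Concentrations in the drop immediately after mixing, assuming ideal dilution."""
    total = protein_volume_ul + reservoir_volume_ul
    if total <= 0:
        raise ValueError("total drop volume must be positive")
    drop = {}
    # Each component is diluted by the fraction of the drop its stock occupies.
    for name, conc in protein_solution.items():
        drop[name] = drop.get(name, 0.0) + conc * protein_volume_ul / total
    for name, conc in reservoir_solution.items():
        drop[name] = drop.get(name, 0.0) + conc * reservoir_volume_ul / total
    return drop


if __name__ == "__main__":
    protein = {"protein_mg_per_ml": 100.0, "hepes_mM": 7.0}
    reservoir = {"cacodylate_mM": 100.0, "peg_percent_wv": 7.0, "ethylene_glycol_percent_vv": 20.0}
    for component, value in mix_drop(1.0, protein, 1.0, reservoir).items():
        print(component, value)
```

With equal volumes every component starts at half its stock concentration; the drop then concentrates as water equilibrates with the reservoir, which this sketch does not model.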
Derivative crystals of the invention can be obtained by soaking native crystals in a solution 
containing salts of heavy metal atoms. Co-crystals of the invention can be obtained by soaking a 
native crystal in a solution containing a compound that binds the SAM domain, or they can be 
obtained by co-crystallizing the SAM domain polypeptide in the presence of one or more compounds 
that bind to the SAM domain. 

Once the crystal is grown it can be placed in a glass capillary tube and mounted onto a
holding device connected to an X-ray generator and an X-ray detection device. Collection of X-ray
diffraction patterns is well documented by those skilled in the art (see for example, Ducruix and
Giegé, 1992, IRL Press, Oxford, England). A beam of X-rays enters the crystal and diffracts from the
crystal. An X-ray detection device can be utilized to record the diffraction patterns emanating from
the crystal. Suitable devices include the Mar 345 imaging plate detector system with an RU200
rotating anode generator.

Methods for obtaining the three dimensional structure of the crystalline form of a molecule
or complex are described herein and known to those skilled in the art (see Ducruix and Giegé, supra).
Generally, the unit cell dimensions and orientation in the crystal can be determined from the spacing
between the diffraction emissions as well as the patterns made from the emissions. The symmetry of
the unit cell in the crystal is also determined. Each diffraction pattern emission is characterized as a
vector and the data collected at this stage determines the amplitude of each vector. The phases of the
vectors may be determined by the isomorphous replacement method where heavy atoms soaked into
the crystal are used as reference points in the X-ray analysis (see for example, Otwinowski, 1991,
Daresbury, United Kingdom, 80-86). The phases of the vectors may also be determined by molecular
replacement (see for example, Navaza, 1994, Proteins 11:281-296). The amplitudes and phases of
vectors from the crystalline form of an Eph SAM domain, preferably an EphA4 SAM domain,
determined in accordance with these methods can be used to analyze other crystalline SAM domains.

The unit cell dimensions and symmetry, and vector amplitude and phase information can be
used in a Fourier transform function to calculate the electron density in the unit cell, i.e. to generate an
experimental electron density map. This may be accomplished using the PHASES package (Furey,
1990). Amino acid sequence structures are fit to the experimental electron density map (i.e. model
building) using computer programs (e.g. Jones, T.A. et al., Acta Crystallogr. A47, 100-119, 1991) to
calculate a theoretical electron density map. The theoretical and experimental electron density maps
can be compared and the agreement between the maps can be described by a parameter referred to as
the R-factor. A high degree of overlap in the maps is represented by a low R-factor value. The R-factor
can be minimized by using computer programs that refine the theoretical electron density map. For
example, the XPLOR program, developed by Brunger (1992, Nature 355:472-475) can be used for
model refinement.
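
The R-factor mentioned above is conventionally computed from observed and calculated structure factor amplitudes as the sum of their absolute differences divided by the sum of the observed amplitudes. The following is a minimal sketch of that conventional definition, assuming the two amplitude lists are already on a common scale and cover the same reflections in the same order; the function name and the example values are hypothetical and are not data from the invention.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(obs) - abs(calc)) for obs, calc in zip(f_obs, f_calc))
    denominator = sum(abs(obs) for obs in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator


if __name__ == "__main__":
    observed = [10.0, 20.0, 30.0]      # hypothetical amplitudes
    calculated = [12.0, 17.0, 30.0]    # hypothetical model amplitudes
    print(r_factor(observed, calculated))
```

A perfect match gives a value of zero, and refinement programs adjust the model so as to lower this value.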

A three dimensional structure of the molecule or complex may be described by atoms that fit
the theoretical electron density characterized by a minimum R value. Files can be created for the
structure that define each atom by coordinates in three dimensions.
Identification of Homologues

The knowledge of the three dimensional structure of a SAM domain, in particular the
EphA4 SAM domain, enables one skilled in the art to identify homologues. This is achieved by
searches of three-dimensional databases. Since structural folds are conserved to a greater extent than
sequence, one may identify homologues with very little sequence similarity. Programs that provide
this type of database searching are known in the art and include Dali. The structural coordinates of a
protein structure are submitted and the program performs a multiple structural alignment with
proteins in the protein data bank.

Methods for Determining Three Dimensional Structures 
The structure coordinates of a SAM domain structure described herein can be used as a
model for determining the three dimensional structures of additional native or mutated SAM domains
with unknown structure, as well as the structures of co-crystals of SAM domains with compounds
such as modulators (e.g. agonists or antagonists). The structure coordinates and models of a SAM
domain three dimensional structure can also be used to determine solution-based structures of native
or mutant SAM domains.

Three dimensional structure may be determined by applying the structural coordinates of a
SAM domain structure to other data such as an amino acid sequence, X-ray crystallographic
diffraction data, or nuclear magnetic resonance (NMR) data. Homology modeling, molecular
replacement, and nuclear magnetic resonance methods using these other data sets are described
below.

Homology modeling (also known as comparative modeling or knowledge-based modeling)
methods develop a three dimensional model from a polypeptide sequence based on the structures of
known proteins. In the present invention the method utilizes a computer representation of the three
dimensional structure of a SAM domain, preferably the EphA SAM domain, more preferably the
EphA4 SAM domain, or a complex of same, a computer representation of the amino acid sequence of
a polypeptide with an unknown structure, and standard computer representations of the structures of
amino acids. The method in particular comprises the steps of: (a) identifying structurally conserved
and variable regions in the known structure; (b) aligning the amino acid sequences of the known
structure and unknown structure; (c) generating coordinates of main chain atoms and side chain
atoms in structurally conserved and variable regions of the unknown structure based on the
coordinates of the known structure, thereby obtaining a homology model; and (d) refining the
homology model to obtain a three dimensional structure for the unknown structure. This method is
well known to those skilled in the art (Greer, 1985, Science 228, 1055; Blundell et al., 1988, Eur. J.
Biochem. 172, 513; Knighton et al., 1992, Science 258:130-135.
http^bchcm.vtedu/courscs/modcling/horoologyJim). Computer programs that can be used in
homology modeling are Quanta and the Homology module in the Insight II modeling package
distributed by Molecular Simulations Inc., or MODELLER (Rockefeller University,
wwwJucr.ac.nk/sinris-tcm/logicaJ/prg-modcUerJitmi).

In step (a) of the homology modeling method, the known SAM domain structure (e.g.
structure of the EphA4 SAM domain) is examined to identify the structurally conserved regions
(SCRs) from which an average structure, or framework, can be constructed for these regions of the
protein. Variable regions (VRs), in which known structures may differ in conformation, also must be
identified. SCRs generally correspond to the elements of secondary structure, such as alpha-helices
(the four α-helices in the EphA4 SAM domain) and beta-sheets, and to ligand- and substrate-binding
sites. The VRs usually lie on the surface of the proteins and form the loops where the main chain
turns.

Many methods are available for sequence alignment of known structures and unknown
structure. Sequence alignments generally are based on the dynamic programming algorithm of
Needleman and Wunsch [J. Mol. Biol. 48: 442-453, 1970]. Current methods include FASTA,
Smith-Waterman, and BLASTP, with the BLASTP method differing from the other two in not allowing
gaps. Scoring of alignments typically involves construction of a 20x20 matrix in which identical
amino acids and those of similar character (i.e., conservative substitutions) may be scored higher than
those of different character. Substitution schemes which may be used to score alignments include the
scoring matrices PAM (Dayhoff et al., Meth. Enzymol. 91: 524-545, 1983) and BLOSUM (Henikoff
and Henikoff, Proc. Natl. Acad. Sci. USA 89: 10915-10919, 1992), and the matrices based on
alignments derived from three-dimensional structures including that of Johnson and Overington (JO
matrices) (J. Mol. Biol. 233: 716-738, 1993).
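
By way of illustration only, the dynamic programming approach to global alignment may be sketched as follows. This is a minimal sketch, not the implementation used by any of the named programs. The function name and the flat match, mismatch, and gap scores are assumptions chosen for brevity; in practice a 20x20 substitution matrix such as PAM or BLOSUM would supply the pairwise scores.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of sequences a and b.

    Returns (score, aligned_a, aligned_b). The flat match/mismatch/gap
    scores are illustrative assumptions, not values from the text.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # gap in b
                score[i][j - 1] + gap,       # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    total, top, bottom = needleman_wunsch("LEAVVHMSQ", "LEFLSDIT")
    print(total)
    print(top)
    print(bottom)
```

A flat scoring scheme keeps the sketch short; replacing the `pair` lookup with a substitution matrix lookup would allow conservative substitutions to score higher than dissimilar ones.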

Alignment based solely on sequence may be used, though other structural features also may 
be taken into account. In Quanta, multiple sequence alignment algorithms are available that may be 
used when aligning a sequence of the unknown with the known structures. Four scoring systems (i.e. 
sequence homology, secondary structure homology, residue accessibility homology, CA-CA distance 
homology) are available, each of which may be evaluated during an alignment so that relative 
statistical weights may be assigned. 

When generating coordinates for the unknown structure, main chain atoms and side chain 
atoms, both in SCRs and VRs, need to be modeled. A variety of approaches known to those skilled in
the art may be used to assign coordinates to the unknown. In particular, the coordinates of the main
chain atoms of SCRs will be transferred to the unknown structure. VRs correspond most often to the
loops on the surface of the polypeptide, and if a loop in the known structure is a good model for the
unknown, then the main chain coordinates of the known structure may be copied. Side chain
coordinates of SCRs and VRs are copied if the residue type in the unknown is identical to or very
similar to that in the known structure. For other side chain coordinates, a side chain rotamer library
may be used to define the side chain coordinates. When a good model for a loop cannot be found,
fragment databases may be searched for loops in other proteins that may provide a suitable model for
the unknown. If desired, the loop may then be subjected to conformational searching to identify low
energy conformers.

Once a homology model has been generated it should be analyzed to determine its 
correctness. A computer program available to assist in this analysis is the Protein Health module in 
Quanta which provides a variety of tests. Other programs that provide structure analysis along with
output include PROCHECK and 3D-Profiler [Luthy R. et al., Nature 356: 83-85, 1992; and Bowie,
J.U. et al., Science 253: 164-170, 1991]. Once any irregularities have been resolved, the entire
structure may be further refined. Refinement may consist of energy minimization with restraints,
especially for the SCRs. Restraints may be gradually removed for subsequent minimizations.
Molecular dynamics may also be applied in conjunction with energy minimization.

Molecular replacement involves applying X-ray diffraction data of a known structure to the 
incomplete X-ray crystallographic data set of a polypeptide of unknown structure. The method can be 
used to define the phases describing the X-ray diffraction data of a polypeptide of unknown structure
when only the amplitudes are known. Commonly used computer software packages for molecular
replacement are X-PLOR (Brunger, 1992, Nature 355: 472-475), AMoRE (Navaza, 1994, Acta
Crystallogr. A50:157-163), the CCP4 package (Collaborative Computational Project, Number 4,
"The CCP4 Suite: Programs for Protein Crystallography", Acta Cryst., Vol. D50, pp. 760-763, 1994),
and the MERLOT package (P.M.D. Fitzgerald, J. Appl. Cryst., Vol. 21, pp. 273-278, 1988). It is
preferable that the resulting structure not exhibit a root-mean-square deviation of more than 3 Å.
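
For illustration, a root-mean-square deviation between two sets of equivalent atoms may be computed as sketched below. The sketch assumes the two coordinate sets are already superimposed and listed in the same atom order; the function name and the tuple-based coordinate format are hypothetical and are not taken from any of the packages named above.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already superimposed and atoms are paired
    by position in the lists.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    reference = [(0.0, 0.2, 0.0), (1.5, -0.1, 0.0), (3.0, 0.3, 0.0)]
    # Compare against the 3 Angstrom preference stated in the text.
    print(rmsd(model, reference) <= 3.0)
```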

The objective of molecular replacement is to align positions of atoms in the unit cell by
matching electron diffraction data from two crystals. Molecular replacement computer programs
generally involve the following steps: (1) determining the number of molecules in the unit cell and
defining the angles between them; (2) rotating the diffraction data to define the orientation of the
molecules in the unit cell; (3) translating the electron density in three dimensions to correctly position
the molecules in the unit cell; (4) determining the amplitudes and phases of the X-ray diffraction data
and calculating an R-factor from the reference data set and from the new data, wherein an
R-factor between 30-50% indicates that the orientations of the atoms in the unit cell have been
reasonably determined by the method; and (5) optionally decreasing the R-factor to about 20% by
refining the new electron density map using iterative refinement techniques known to those skilled in
the art.
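
As an illustration of step (4), the conventional crystallographic R-factor, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, may be computed as sketched below. The text does not state which definition of the R-factor is intended, so the use of this conventional formula is an assumption, as are the function name and the plain list inputs.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor from observed and calculated structure factor amplitudes.

    Assumes both lists cover the same reflections in the same order and
    that any scaling between the two data sets has already been applied.
    """
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / denominator


if __name__ == "__main__":
    observed = [120.0, 85.0, 40.0, 63.0]
    calculated = [100.0, 90.0, 55.0, 60.0]
    r = r_factor(observed, calculated)
    # Report as a percentage, the form used in the text (30-50%, about 20%).
    print(f"R = {100 * r:.1f}%")
```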

In an embodiment of the invention, a method is provided for determining three dimensional 
structures of polypeptides with unknown structure by applying the structural coordinates of a SAM 
domain structure to an incomplete X-ray crystallographic data set for a polypeptide of unknown 
structure, and determining a low energy conformation of the resulting structure. 

The structural coordinates of a SAM domain structure may be applied to nuclear magnetic
resonance (NMR) data to determine the three dimensional structures of polypeptides. (See for
example, Wuthrich, 1986, John Wiley and Sons, New York: 176-199; Pflugrath et al., 1986, J.
Molecular Biology 189: 383-386; Kline et al., 1986, J. Molecular Biology 189: 377-382). While the
secondary structure of a polypeptide may often be determined by NMR data, the spatial connections
between individual pieces of secondary structure are not as readily determined. The structural
coordinates of a polypeptide defined by X-ray crystallography can guide the NMR spectroscopist to
an understanding of the spatial interactions between secondary structural elements in a polypeptide of
related structure. Information on spatial interactions between secondary structural elements can
greatly simplify Nuclear Overhauser Effect (NOE) data from two-dimensional NMR experiments. In
addition, applying the structural coordinates after the determination of secondary structure by NMR
techniques simplifies the assignment of NOEs relating to particular amino acids in the polypeptide
sequence and does not greatly bias the NMR analysis of polypeptide structure.

In an embodiment, the invention relates to a method of determining three dimensional
structures of polypeptides with unknown structures by applying the structural coordinates of a SAM
domain structure to nuclear magnetic resonance (NMR) data of the unknown structure. This method
comprises the steps of: (a) determining the secondary structure of an unknown structure using NMR
data; and (b) simplifying the assignment of through-space interactions of amino acids. The term
"through-space interactions" defines the orientation of the secondary structural elements in the three
dimensional structure and the distances between amino acids from different portions of the amino
acid sequence. The term "assignment" defines a method of analyzing NMR data and identifying
which amino acids give rise to signals in the NMR spectrum.
Identification of Potential Modulators of SAM Domains 

Modulators of a SAM domain may be designed and identified that may modify the 
inappropriate activity of a SAM domain involved in a clinical disorder. The rational design and 
identification of modulators of SAM domains can be accomplished by utilizing the atomic structural 
coordinates that define a SAM domain's three dimensional structure.

Modulators may include substances that bind to or mimic the residues of a SAM domain that
are required for dimerization of SAM domains. For example, a substance that binds to or mimics the
interface residues of an EphA SAM domain (e.g. Val 913, Val 914, Met 972, Met 976, Met 979, Val
944, and Leu 940), or the proximal residues of an EphA SAM domain (e.g. Ile 959 to Lys) may
modify inappropriate activity of a SAM domain involved in a clinical disorder.

Structure-based modulator design and identification methods are powerful techniques that can
involve searches of computer databases containing a variety of potential modulators and chemical
functional groups. (See Kuntz et al., 1994, Acc. Chem. Res. 27: 117; Guida, 1994, Current Opinion in
Struc. Biol. 4: 777; and Colman, 1994, Current Opinion in Struc. Biol. 4: 868, for reviews of
structure-based drug design and identification; and Kuntz et al., 1982, J. Mol. Biol. 162: 269; Kuntz et
al., 1994, Acc. Chem. Res. 27: 117; Meng et al., 1992, J. Comput. Chem. 13: 505; Bohm, 1994, J.
Comp. Aided Molec. Design 8: 623 for methods of structure-based modulator design).

The SAM domain three dimensional structure described herein, and the three dimensional
structures of other polypeptides determined by the homology modeling, molecular replacement, and
NMR techniques described herein can also be applied to modulator design and identification
methods.

Modulators of SAM domains may be identified by docking the computer representation of
compounds from a database of molecules. Databases which may be used include ACD (Molecular
Designs Limited), NCI (National Cancer Institute), CCDC (Cambridge Crystallography Data
Center), CAST (Chemical Abstract Service), Derwent (Derwent Information Limited), Maybridge
(Maybridge Chemical Company Ltd), Aldrich (Aldrich Chemical Company), DOCK (University of
California in San Francisco), and the Directory of Natural Products (Chapman & Hall). Computer
programs such as CONCORD (Tripos Associates) or DB-Converter (Molecular Simulations Limited)
can be used to convert a data set represented in two dimensions to one represented in three
dimensions.

Generally, the computer programs comprise the following steps: 

(a) docking the structure of a compound into an active-site of a polypeptide (e.g. EphA4
SAM domain) using the computer program, or by interactively moving the compound
into the active-site;

(b) characterizing the geometry and the complementary interactions formed between the
atoms of the active-site and the compound; and optionally

(c) searching libraries for molecular fragments which can fit into the empty space between
the compound and active site and can be linked to the compound; and

(d) linking the fragments found in (c) to the compound and evaluating the new modified
compound.

"Docking" refers to a process of placing a compound in close proximity with an active site
of a polypeptide (e.g. an Eph SAM domain), or a process of finding low energy conformations of a
compound/polypeptide complex (e.g. compound/Eph SAM domain).

Examples of other computer programs that may be used for structure-based modulator
design are CAVEAT (Bartlett et al., 1989, in "Chemical and Biological Problems in Molecular
Recognition", Roberts, S.M.; Ley, S.V.; Campbell, N.M. eds; Royal Society of Chemistry:
Cambridge, pp 182-196); FLOG (Miller et al., 1994, J. Comp. Aided Molec. Design 8:153); PRO
Modulator (Clark et al., 1995, J. Comp. Aided Molec. Design 9:13); MCSS (Miranker and Karplus,
1991, Proteins: Structure, Function, and Genetics 8:195); and GRID (Goodford, 1985, J. Med. Chem.
28:849).

In an embodiment of the invention, a method is provided for identifying potential 
modulators of SAM domain function. The method utilizes the structural coordinates of a SAM
domain three dimensional structure. The method comprises the steps of (a) removing a computer
representation of a SAM domain structure, preferably an Eph SAM domain structure, more
preferably an EphA4 SAM domain structure, and docking a computer representation of a compound
from a computer data base with a computer representation of the active site of the SAM domain; (b)
determining a conformation of the complex with a favourable geometric fit or favourable
complementary interactions; and (c) identifying compounds that best fit the SAM domain active-site
as potential modulators of SAM domain function. The initial SAM domain structure may or may not
have compounds bound to it. A favourable geometric fit occurs when the surface area of a
compound in a compound-SAM domain complex is in close proximity with the surface area of the
active-site of the SAM domain without forming unfavourable interactions. A favourable
complementary interaction occurs where a compound in a compound-SAM domain complex interacts
by hydrophobic, aromatic, ionic, or hydrogen donating and accepting forces, with the active-site of a
SAM domain without forming unfavourable interactions. Unfavourable interactions may be steric
hindrance between atoms in the compound and atoms in the SAM active-site.
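
A crude numerical reading of "favourable geometric fit" may be sketched as follows, purely for illustration. The sketch counts compound-to-site atom pairs that are in close proximity and pairs that are close enough to suggest steric hindrance. The function name and the two distance cutoffs (2.5 Å for a clash, 4.5 Å for a contact) are assumptions introduced here; they are not values given in the text, and real docking programs use considerably richer scoring functions.

```python
import math


def fit_summary(compound_atoms, site_atoms, clash_cutoff=2.5, contact_cutoff=4.5):
    """Count close contacts and steric clashes between two atom lists.

    Atoms are (x, y, z) tuples in Angstroms. Both cutoffs are illustrative
    assumptions, not values taken from the text.
    """
    contacts = 0
    clashes = 0
    for p in compound_atoms:
        for q in site_atoms:
            d = math.dist(p, q)
            if d < clash_cutoff:
                clashes += 1      # too close: treated as steric hindrance
            elif d <= contact_cutoff:
                contacts += 1     # close proximity without a clash
    return {
        "contacts": contacts,
        "clashes": clashes,
        # Favourable here means some proximity and no unfavourable clashes.
        "favourable": contacts > 0 and clashes == 0,
    }


if __name__ == "__main__":
    compound = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)]
    site = [(0.0, 3.8, 0.0), (1.4, 4.0, 0.0), (9.0, 9.0, 9.0)]
    print(fit_summary(compound, site))
```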

In another embodiment, potential modulators are identified utilizing a three dimensional 
structure of a SAM domain with or without compounds bound to it. The method comprises the steps 
of (a) modifying a computer representation of a SAM domain (e.g. an Eph SAM domain) having one 
or more compounds bound to it, where the computer representations of the compound or compounds 
and SAM domain are defined by atomic structural coordinates; (b) determining a conformation of the 
complex with a favorable geometric fit and favorable complementary interactions; and (c) identifying
the compounds that best fit the SAM active site as potential modulators. A computer representation 
may be modified by deleting or adding a chemical group or groups. Computer representations of the 
chemical groups can be selected from a computer database. 

Another way of identifying potential modulators is to modify an existing modulator in the
polypeptide active-site. The computer representation of modulators can be modified within the
computer representation of a SAM domain active-site. This technique is described in detail in
Molecular Simulations User Manual, 1995 in LUDI. The computer representation of a modulator
may be modified by deleting a chemical group or groups, or by adding a chemical group or groups. 
After each modification to a compound, the atoms of the modified compound and active-site can be 
shifted in conformation and the distance between the modulator and the active site atoms may be 
scored on the basis of geometric fit and favourable complementary interactions between the 
molecules. Compounds with favourable scores are potential modulators. 

Compounds designed by modulator building or modulator searching computer programs 
may be screened to identify potential modulators. Examples of such computer programs include 
programs in the Molecular Simulations Package (Catalyst), ISIS/HOST, ISIS/BASE, and
ISIS/DRAW (Molecular Designs Limited), and UNITY (Tripos Associates). A building program may
be used to replace computer representations of chemical groups in a compound complexed with a
SAM domain with groups from a computer data base. A searching program may be used to search
computer representations of compounds from a computer database that have similar three 
dimensional structures and similar chemical groups as a compound that binds to a SAM domain. The 
programs may be operated on the structure of the active-site of the three dimensional structure of an 
Eph SAM domain, preferably an EphA4 SAM domain. 

A typical program may comprise the following steps: 

(a) mapping chemical features of the compound such as by hydrogen bond donors or 
acceptors, hydrophobic/lipophilic sites, positively ionizable sites, or negatively 
ionizable sites; 

(b) adding geometric constraints to selected mapped features; and

(c) searching databases with the model generated in (b).
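
The three steps above may be illustrated with a simplified pairwise distance-constraint filter. This is a sketch under stated assumptions and does not reproduce the behaviour of any of the named programs: the feature labels, the constraint format (two feature types, a target distance, and a tolerance), the function names, and the toy database are all hypothetical. For brevity each constraint is checked independently, so the sketch does not enforce one consistent assignment of features across all constraints.

```python
import math


def satisfies(features, constraints):
    """Check a compound's mapped features against distance constraints.

    features: list of (feature_type, (x, y, z)) pairs for one compound.
    constraints: list of (type_a, type_b, target_distance, tolerance).
    Each constraint is tested on its own (a simplifying assumption).
    """
    for type_a, type_b, target, tol in constraints:
        found = False
        for i, (ta, pa) in enumerate(features):
            for j, (tb, pb) in enumerate(features):
                if i == j or ta != type_a or tb != type_b:
                    continue
                if abs(math.dist(pa, pb) - target) <= tol:
                    found = True
        if not found:
            return False
    return True


def search(database, constraints):
    """Return the names of database compounds that satisfy every constraint."""
    return [name for name, feats in database.items() if satisfies(feats, constraints)]


if __name__ == "__main__":
    # (a) mapped features and (b) a geometric constraint between two of them.
    query = [("donor", "hydrophobic", 5.0, 0.5)]
    # (c) a toy database of pre-mapped compounds (hypothetical entries).
    toy_database = {
        "compound_1": [("donor", (0.0, 0.0, 0.0)), ("hydrophobic", (5.2, 0.0, 0.0))],
        "compound_2": [("donor", (0.0, 0.0, 0.0)), ("hydrophobic", (9.0, 0.0, 0.0))],
    }
    print(search(toy_database, query))
```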
In an embodiment of the invention a method of identifying potential modulators of a SAM
domain, preferably an Eph SAM domain, more preferably an EphA SAM domain, is provided using
the three dimensional conformation of the SAM domain in various modulator construction or
modulator searching computer programs on compounds complexed with the SAM domain. The
method comprises the steps of (a) removing a computer representation of one or more compounds
complexed with a SAM domain; (b) (i) searching a data base for a compound with a similar
geometric structure or similar chemical groups to the removed compounds using a computer program
that searches computer representations of compounds from a database that have similar three
dimensional structures and similar chemical groups or (ii) replacing portions of the compounds
complexed with the SAM domain with similar chemical structures (i.e. nearly identical shape and
volume) from a database using a compound construction computer program that replaces computer
representations of chemical groups with groups from a computer database, where the representations
of the compounds are defined by structural coordinates.

Potential modulators of SAM domains identified using the above-described methods may be
prepared using methods described in standard reference sources utilized by those skilled in the art.
For example, organic compounds may be prepared by organic synthetic methods described in
references such as March, 1994, Advanced Organic Chemistry: Reactions, Mechanisms, and
Structure, New York, McGraw Hill.

Cellular assays, as well as animal model assays in vivo, may be used to test the activity of a
potential modulator of a SAM domain as well as diagnose a disease associated with inappropriate
SAM domain activity. In vivo assays are also useful for testing the bioactivity of a potential
modulator designed by the methods of the invention.

The invention also relates to a potential modulator identified by the methods of the
invention.
Peptides

The invention provides peptide molecules that modulate SAM domain function. The
molecules are derived from the interface residues necessary for dimer formation. For example,
peptides of the invention include the amino acids Val 913, Val 914, Met 972, Met 976, Met 979, Val
944, and Leu 940 of the EphA4 SAM domain. Other proteins containing sequences corresponding to
the sequences necessary for dimer formation of a SAM domain may be identified with a protein
homology search, for example by searching available databases such as GenBank or SwissProt.
Various search algorithms and/or programs may be used, including FASTA, BLAST (available as a
part of the GCG sequence analysis package, University of Wisconsin, Madison, Wis.), or ENTREZ
(National Center for Biotechnology Information, National Library of Medicine, National Institutes of
Health, Bethesda, MD).

In accordance with an embodiment of the invention, specific peptides are contemplated that
mediate SAM domain function comprising VVSV (SEQ ID. NO. 21), SAVVSV (SEQ ID. NO. 22),
FSAVV (SEQ ID. NO. 23), FSAVVSV (SEQ ID. NO. 24), FSAVVSVGD (SEQ ID. NO. 25),
VVSVGDWL (SEQ ID. NO. 26), FNTV (SEQ ID. NO. 27), FNTVDE (SEQ ID. NO. 28),
FNTVDEWL (SEQ ID. NO. 29), TSFNTVDEWL (SEQ ID. NO. 30), TSFNTV (SEQ ID. NO. 31),
YTSFNTV (SEQ ID. NO. 32), RSEV (SEQ ID. NO. 33), RSEVLG (SEQ ID. NO. 34), RSEVLGWD
(SEQ ID. NO. 35), VPFRSEV (SEQ ID. NO. 36), and VPFRSEVLGW (SEQ ID. NO. 37).

In accordance with another embodiment of the invention, specific peptides are contemplated
that mediate SAM domain function. In particular, a peptide of the formula I is provided which
mediates SAM domain function:

X-X¹-X²-X³-X⁴-X⁵-X⁶    (I)

wherein X and X⁶ represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20 amino
acids, and X¹ represents Leu, Phe, Asp, Ala, Glu, or Gly, preferably Leu or Gly, X² represents Glu,
Asp, Ser, Ile, Ala, Arg, Lys, or Gln, preferably Glu or Asp, X³ represents Ala, Val, Glu, Phe, Ser,
Ile, Met, Leu, His, Gln, Arg, or Asp, preferably Ala, Val, or Phe, X⁴ is Val, Leu, Met, Phe, or Ile,
preferably Val, Leu, or Phe, and X⁵ is Val, Ser, Leu, Asp, Ala, Pro, Asn, Lys, or Cys, preferably Val or
Ser.
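
For illustration, the residue choices at the five core positions of formula I can be expressed as a pattern and tested against candidate sequences. The sketch below assumes formula I is read as a five-residue core X¹ to X⁵ flanked by X and X⁶ of 0 to 70 residues each, with the residue sets transcribed to one-letter codes from the definitions above. The function and variable names are hypothetical.

```python
import re

# One-letter residue sets for core positions X1..X5, transcribed from the
# definitions of formula I (an assumed reading of the text).
X1 = "LFDAEG"        # Leu, Phe, Asp, Ala, Glu, Gly
X2 = "EDSIARKQ"      # Glu, Asp, Ser, Ile, Ala, Arg, Lys, Gln
X3 = "AVEFSIMLHQRD"  # Ala, Val, Glu, Phe, Ser, Ile, Met, Leu, His, Gln, Arg, Asp
X4 = "VLMFI"         # Val, Leu, Met, Phe, Ile
X5 = "VSLDAPNKC"     # Val, Ser, Leu, Asp, Ala, Pro, Asn, Lys, Cys
CORE = re.compile(f"[{X1}][{X2}][{X3}][{X4}][{X5}]")


def matches_formula_i(peptide, max_flank=70):
    """True if some 5-residue window fits the core and both flanks are short enough."""
    for start in range(len(peptide) - 4):
        left_flank = start
        right_flank = len(peptide) - (start + 5)
        if (left_flank <= max_flank and right_flank <= max_flank
                and CORE.fullmatch(peptide, start, start + 5)):
            return True
    return False


if __name__ == "__main__":
    for candidate in ["LEAVV", "FDVVS", "LEFLS", "GARFL", "TTLEAVVHM", "WWWWW"]:
        print(candidate, matches_formula_i(candidate))
```

The four core sequences named among the preferred peptides (LEAVV, FDVVS, LEFLS, GARFL) each fall within these residue sets, which offers a consistency check on the transcription.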

In an embodiment of the present invention a peptide of the formula I is provided
wherein X represents TT, ID, TS, DD, GYTT (SEQ ID. NO. 38), AAGYTT (SEQ ID. NO. 39),
FTAAGYTT (SEQ ID. NO. 40), DNFTAAGYTT (SEQ ID. NO. 41), or YKDNFTAAGYTT (SEQ
ID. NO. 42). In another embodiment X⁶ represents HM, HMSQ (SEQ ID. NO. 43), HMSQD (SEQ
ID. NO. 44), HMSQDD (SEQ ID. NO. 45), HMSQDDLA (SEQ ID. NO. 46), QMMM (SEQ ID. NO.
47), QMMMED (SEQ ID. NO. 48), QMMMEDLL (SEQ ID. NO. 49), DITE (SEQ ID. NO. 50),
DITEED (SEQ ID. NO. 51), DITEEDL (SEQ ID. NO. 52), NLTE (SEQ ID. NO. 53), NLTEND
(SEQ ID. NO. 54), or NLTENDI (SEQ ID. NO. 55).

Preferred peptides of the formula I include the following: X-LEAVV-X⁶, X-FDVVS-X⁶,
X-LEFLS-X⁶, X-GARFL-X⁶, LEAVV (SEQ ID. NO. 56), TTLEAVV (SEQ ID. NO. 57), LEAVVHM
(SEQ ID. NO. 58), LEAVVHMSQ (SEQ ID. NO. 59), LEAVVHMSQD (SEQ ID. NO. 60),
LEAVVHMSQDDL (SEQ ID. NO. 61), LEAVVHMSQDDLAR (SEQ ID. NO. 62),
TTLEAVVHMS (SEQ ID. NO. 63), TTLEAVVHMSQD (SEQ ID. NO. 64), TTLEAVVHMSQDDL
(SEQ ID. NO. 65), TTLEAVVHMSQDDLAR (SEQ ID. NO. 66), GYTTLEAVV (SEQ ID. NO. 67),
GYTTLEAVVHMS (SEQ ID. NO. 68), GYTTLEAVVHMSQD (SEQ ID. NO. 69),
GYTTLEAVVHMSQDDL (SEQ ID. NO. 70), GYTTLEAVVHMSQDDLAR (SEQ ID. NO. 71),
FDVVS (SEQ ID. NO. 72), FDVVSQ (SEQ ID. NO. 73), FDVVSQMM (SEQ ID. NO. 74),
FDVVSQMMME (SEQ ID. NO. 75), FDVVSQMMMEDIL (SEQ ID. NO. 76), TSFDVVS (SEQ ID.
NO. 77), TSFDVVSQ (SEQ ID. NO. 78), TSFDVVSQMM (SEQ ID. NO. 79), TSFDVVSQMMME
(SEQ ID. NO. 80), TSFDVVSQMMMEDIL (SEQ ID. NO. 81), LEFLS (SEQ ID. NO. 82), LEFLSD
(SEQ ID. NO. 83), LEFLSDIT (SEQ ID. NO. 84), LEFLSDITEE (SEQ ID. NO. 85),
LEFLSDITEEDL (SEQ ID. NO. 86), DDLEFLS (SEQ ID. NO. 87), GWDDLEFLS (SEQ ID. NO.
88), DDLEFLSD (SEQ ID. NO. 89), DDLEFLSDIT (SEQ ID. NO. 90), DDLEFLSDITEE (SEQ ID.
NO. 91), DDLEFLSDITEEDL (SEQ ID. NO. 92), GARFL (SEQ ID. NO. 93), GARFLN (SEQ ID.
NO. 94), GARFLNLT (SEQ ID. NO. 95), GARFLNLTEN (SEQ ID. NO. 96), and IDGARFL (SEQ
ID. NO. 97).

In accordance with another embodiment of the invention, specific peptides are contemplated
that mediate SAM domain function. In particular, a peptide of the formula II is provided which
mediates SAM domain function:

X⁷-X⁸-X⁹-X¹⁰-X¹¹-X¹²-X¹³-X¹⁴-X¹⁵-X¹⁶    (II)

wherein X⁷ and X¹⁶ represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20 amino
acids, and X⁸ represents Met, Ile, Ser, Leu, Asn, Phe, or Val, preferably Met, X⁹ represents Arg, Ser,
Lys, Met, Leu, Glu, Gln, or Asn, preferably Gln or Arg, X¹⁰ represents Thr, Ala, Arg, Leu, Ser, Glu,
Asp, Met, Lys, Gln, or Gly, preferably Thr, Ala, or Glu, X¹¹ represents Gln, Ser, Glu, Leu, Phe, Asp,
Thr, or Arg, preferably Gln or Arg, X¹² represents Met, Ala, Ile, Asn, Ser, Arg, Thr, Pro, Leu, Gln, Val, or
Lys, preferably Met or Arg, X¹³ represents Gln, Asn, Pro, Ser, Tyr, Gm, Leu, Arg, or Lys, preferably
Gln, Asn, or Arg, X¹⁴ represents Gln, Ala, Pro, Asp, Leu, Lys, Ile, Glu, Arg, or Asn, preferably Gln
or Ile, and X¹⁵ represents Met, Ile, Val, His, Ser, Arg, Lys, Phe, Cys, Glu, Tyr, Ala, Ile, Trp, or Leu.

In an embodiment of the present invention a peptide of the formula II is provided
wherein X⁷ represents QA, QV, NK, SVQA (SEQ ID. NO. 98), LSSVQA (SEQ ID. NO. 99),
ILSSVQA (SEQ ID. NO. 100), NKILSSVQA (SEQ ID. NO. 101), HQNKILSSVQA (SEQ ID. NO.
102), THQNKILSSVQA (SEQ ID. NO. 103), ENIK (SEQ ID. NO. 104), SQEINK (SEQ ID. NO.
105), KLSQEINK (SEQ ID. NO. 106), ELNSIQV (SEQ ID. NO. 107), or NSIQV (SEQ ID. NO.
108). In another embodiment X¹⁶ is HG, QS, HGRM (SEQ ID. NO. 109), HGRMVP (SEQ ID. NO.
110), QSVEV (SEQ ID. NO. 111), or TRKP (SEQ ID. NO. 112).

Preferred peptides of the formula II include the following: X⁷-MRTQMQQM-X¹⁶,
X⁷-MRAQMNQI-X¹⁶, X⁷-NEERRSIF-X¹⁶, MRTQMQQM (SEQ ID. NO. 113), QAMRTQMQQM
(SEQ ID. NO. 114), SVQAMRTQMQQM (SEQ ID. NO. 115), LSSVQAMRTQMQQM (SEQ ID.
NO. 116), ILSSVQAMRTQMQQM (SEQ ID. NO. 117), MRTQMQQMHG (SEQ ID. NO. 118),
MRTQMQQMHGRM (SEQ ID. NO. 119), MRTQMQQMHGRMVPV (SEQ ID. NO. 120),
NEERRSIF (SEQ ID. NO. 121), INKNEERRSIF (SEQ ID. NO. 122), NEERRSIFTRKP (SEQ ID.
NO. 123), MRAQMNQI (SEQ ID. NO. 124), MRAQMNQIQS (SEQ ID. NO. 125), and
MRAQMNQIQSVEV (SEQ ID. NO. 126).

All of the peptides of the invention, as well as molecules substantially homologous, 
complementary or otherwise functionally or structurally equivalent to these peptides may be used for 
purposes of the present invention. In addition to full-length peptides of the invention, truncations of
the peptides are contemplated in the present invention. Truncated peptides may comprise peptides of
about 7 to 10 amino acid residues.
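
As an illustration of how candidate truncations in this size range might be enumerated, the sketch below lists every contiguous fragment of 7 to 10 residues that is shorter than the parent peptide. Treating truncations as contiguous fragments is an assumption made for the example, and the function name is hypothetical.

```python
def truncations(peptide, min_len=7, max_len=10):
    """List contiguous fragments of min_len..max_len residues, excluding the full peptide."""
    fragments = []
    longest = min(max_len, len(peptide) - 1)  # a truncation is shorter than its parent
    for length in range(min_len, longest + 1):
        for start in range(len(peptide) - length + 1):
            fragments.append(peptide[start:start + length])
    return fragments


if __name__ == "__main__":
    # LEAVVHMSQDDL is one of the formula I peptides listed in the text.
    for fragment in truncations("LEAVVHMSQDDL"):
        print(fragment)
```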

The truncated peptides may have an amino group (-NH2), a hydrophobic group (for 
example, carbobenzoxyl, dansyl, or T-butyloxycarbonyl), an acetyl group, a 9-fluorenylmethoxy-carbonyl
(FMOC) group, or a macromolecule including but not limited to lipid-fatty acid conjugates,
polyethylene glycol, or carbohydrates at the amino terminal end. The truncated peptides may have a
carboxyl group, an amido group, a T-butyloxycarbonyl group, or a macromolecule including but not
limited to lipid-fatty acid conjugates, polyethylene glycol, or carbohydrates at the carboxy terminal
end.

The peptides of the invention may also include analogs of a peptide of the invention and/or 
truncations of the peptide, which may include, but are not limited to a peptide of the invention 
containing one or more amino acid insertions, additions, or deletions or both. Analogs of the peptide 
of the invention exhibit the activity characteristic of the peptide, e.g. interference with SAM domain
dimer formation, and may further possess additional advantageous features such as increased
bioavailability, stability, or reduced host immune recognition. One or more amino acid insertions may
be introduced into a peptide of the invention. Amino acid insertions may consist of a single amino
acid residue or sequential amino acids.

One or more amino acids, preferably one to five amino acids, may be added to the right or
left termini of a peptide of the invention. Deletions may consist of the removal of one or more amino
acids, or discrete portions from the peptide sequence. The deleted amino acids may or may not be 
contiguous. The lower limit length of the resulting analog with a deletion mutation is about 7 amino 
acids. 

The invention also includes a peptide conjugated with a selected protein, or a selectable 
marker (see below) to produce fusion proteins.

The peptides of the invention may be prepared using recombinant DNA methods.
Accordingly, nucleic acid molecules which encode a peptide of the invention may be incorporated in
a known manner into an appropriate expression vector which ensures good expression of the peptide.
Possible expression vectors include but are not limited to cosmids, plasmids, or modified viruses so
long as the vector is compatible with the host cell used. The expression vectors contain a nucleic acid
molecule encoding a peptide of the invention and the necessary regulatory sequences for the
transcription and translation of the inserted protein-sequence. Suitable regulatory sequences may be
obtained from a variety of sources, including bacterial, fungal, viral, mammalian, or insect genes
(for example, see the regulatory sequences described in Goeddel, Gene Expression Technology:
Methods in Enzymology 185, Academic Press, San Diego, CA (1990)). Selection of appropriate
regulatory sequences is dependent on the host cell chosen, and may be readily accomplished by one
of ordinary skill in the art. Other sequences, such as an origin of replication, additional DNA
restriction sites, enhancers, and sequences conferring inducibility of transcription may also be
incorporated into the expression vector.

The recombinant expression vectors may also contain a selectable marker gene which
facilitates the selection of transformed or transfected host cells. Suitable selectable marker genes are
genes encoding proteins which confer resistance to certain drugs such as G418 and hygromycin,
galactosidase, chloramphenicol acetyltransferase, firefly luciferase, or an immunoglobulin or portion
thereof such as the Fc portion of an immunoglobulin, preferably IgG. The selectable markers may be
introduced on a separate vector from the nucleic acid of interest.

The recombinant expression vectors may also contain genes that encode a fusion portion 
which provides increased expression of the recombinant peptide; increased solubility of the 
recombinant peptide; and/or aid in the purification of the recombinant peptide by acting as a ligand in 
affinity purification. For example, a proteolytic cleavage site may be inserted in the recombinant 
peptide to allow separation of the recombinant peptide from the fusion portion after purification of 
the fusion protein. Examples of fusion expression vectors include pGEX (Amrad Corp., Melbourne, 
Australia), pMAL (New England Biolabs, Beverly, MA) and pRIT5 (Pharmacia, Piscataway, NJ) 
which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to
the recombinant protein. 

Recombinant expression vectors may be introduced into host cells to produce a transformant
host cell. Transformant host cells include prokaryotic and eukaryotic cells which have been
transformed or transfected with a recombinant expression vector of the invention. The terms
"transformed with", "transfected with", "transformation" and "transfection" are intended to include
the introduction of nucleic acid (e.g. a vector) into a cell by one of many techniques known in the art.
For example, prokaryotic cells can be transformed with nucleic acid by electroporation or
calcium-chloride mediated transformation. Nucleic acid can be introduced into mammalian cells using
conventional techniques such as calcium phosphate or calcium chloride co-precipitation,
DEAE-dextran-mediated transfection, lipofectin, electroporation or microinjection. Suitable methods for
transforming and transfecting host cells may be found in Sambrook et al. (Molecular Cloning: A
Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press (1989)), and other laboratory
textbooks. 

Suitable host cells include a wide variety of prokaryotic and eukaryotic host cells. For
example, the peptides of the invention may be expressed in bacterial cells such as E. coli, insect cells
(using baculovirus), yeast cells or mammalian cells. Other suitable host cells can be found in
Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego,
CA (1991).

The peptides of the invention may be tyrosine phosphorylated using the method described in 
Reedijk et al. (The EMBO Journal 11(4): 1365, 1992). For example, tyrosine phosphorylation may be
induced by infecting bacteria harbouring a plasmid containing a nucleotide sequence encoding a
peptide of the invention, with a λgt11 bacteriophage encoding the cytoplasmic domain of the Elk
tyrosine kinase as a LacZ-Elk fusion. Bacteria containing the plasmid and bacteriophage as a lysogen
are isolated. Following induction of the lysogen, the expressed peptide becomes phosphorylated by
the Elk tyrosine kinase. 

The peptides of the invention may be synthesized by conventional techniques. For example, 
the peptides may be prepared by chemical synthesis using either solid phase or solution phase 
synthesis methods (see for example, J. M. Stewart and J. D. Young, Solid Phase Peptide Synthesis, 
2nd Ed., Pierce Chemical Co., Rockford, Ill. (1984) and G. Barany and R. B. Merrifield, The Peptides: 
Analysis, Synthesis, Biology, editors E. Gross and J. Meienhofer, Vol. 2, Academic Press, New York, 
1980, pp. 3-254 for solid phase synthesis techniques; and M. Bodanszky, Principles of Peptide 
Synthesis, Springer-Verlag, Berlin 1984, and E. Gross and J. Meienhofer, Eds., The Peptides: 
Analysis, Synthesis, Biology, supra, Vol. 1, for classical solution synthesis). By way of example, 
the peptides may be synthesized using 9-fluorenylmethoxycarbonyl (Fmoc) solid phase chemistry 
with direct incorporation of phosphotyrosine as the 
N-fluorenylmethoxycarbonyl-O-dimethyl phosphono-L-tyrosine derivative. 

N-terminal or C-terminal fusion proteins comprising a peptide of the invention conjugated 
with other molecules may be prepared by fusing, through recombinant techniques, the N-terminal or 
C-terminal of the peptide, and the sequence of a selected protein or selectable marker with a desired 
biological function. The resultant fusion proteins contain the peptide fused to the selected protein or 
marker protein as described herein. Examples of proteins which may be used to prepare fusion 
proteins include immunoglobulins, glutathione-S-transferase (GST), hemagglutinin (HA), and 
truncated myc. 

Cyclic derivatives of the peptides of the invention are also part of the present invention. 
Cyclization may allow the peptide to assume a more favorable conformation for association with 
molecules in complexes of the invention. Cyclization may be achieved using techniques known in the 
art. For example, disulfide bonds may be formed between two appropriately spaced components 
having free sulfhydryl groups, or an amide bond may be formed between an amino group of one 
component and a carboxyl group of another component. Cyclization may also be achieved using an 
azobenzene-containing amino acid as described by Ulysse, L., et al., J. Am. Chem. Soc. 1995, 117, 
8466-8467. The side chains of Tyr and Asn may be linked to form cyclic peptides. The components 
that form the bonds may be side chains of amino acids, non-amino acid components or a combination 
of the two. In an embodiment of the invention, cyclic peptides are contemplated that have a beta-turn 
in the right position. Beta-turns may be introduced into the peptides of the invention by adding the 
amino acids Pro-Gly at the right position. 

It may be desirable to produce a cyclic peptide that is more flexible than the cyclic peptides 
containing peptide bond linkages as described above. A more flexible peptide may be prepared by 
introducing cysteines at the right and left position of the peptide and forming a disulphide bridge 
between the two cysteines. The two cysteines are arranged so as not to deform the beta-sheet and 
turn. The peptide is more flexible as a result of the length of the disulfide linkage and the smaller 
number of hydrogen bonds in the beta-sheet portion. The relative flexibility of a cyclic peptide can be 
determined by molecular dynamics simulations. Peptide mimetics may be designed based on 
information obtained by systematic replacement of L-amino acids by D-amino acids, replacement of 
side chains with groups having different electronic properties, and by systematic replacement of 
peptide bonds with amide bond replacements. Local conformational constraints can also be 
introduced to determine conformational requirements for activity of a candidate peptide mimetic. The 
mimetics may include isosteric amide bonds, or D-amino acids to stabilize or promote reverse turn 
conformations and to help stabilize the molecule. Cyclic amino acid analogues may be used to 
constrain amino acid residues to particular conformational states. The mimetics can also include 
mimics of inhibitor peptide secondary structures. These structures can model the 3-dimensional 
orientation of amino acid residues into the known secondary conformations of the proteins. Peptoids 
may also be used which are oligomers of N-substituted amino acids and can be used as motifs for the 
generation of chemically diverse libraries of novel molecules. 

Peptides of the invention may be developed using a biological expression system. The use of 
these systems allows the production of large libraries of random peptide sequences and the screening 
of these libraries for peptide sequences that interact with particular amino acid residues. Libraries 
may be produced by cloning synthetic DNA that encodes random peptide sequences into appropriate 
expression vectors (see Christian et al. 1992, J. Mol. Biol. 227:711; Devlin et al. 1990, Science 
249:404; Cwirla et al. 1990, Proc. Natl. Acad. Sci. USA, 87:6378). Libraries may also be constructed 
by concurrent synthesis of overlapping peptides (see U.S. Pat. No. 4,708,871). 
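
As a rough illustration of the two library strategies mentioned above, the following Python sketch counts the distinct sequences in a fully random peptide library and enumerates overlapping peptides from a parent sequence. The function names, the peptide lengths, the window and step sizes, and the example sequence are all hypothetical choices made for illustration; none of them is taken from the cited references.

```python
# Illustrative sketch only: names, lengths and the example sequence are assumptions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def library_size(length: int, alphabet: str = AMINO_ACIDS) -> int:
    """Number of distinct peptides of the given length over the alphabet."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return len(alphabet) ** length


def overlapping_peptides(sequence: str, window: int, step: int = 1) -> list:
    """Enumerate overlapping peptides of fixed length along a parent sequence."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    last_start = len(sequence) - window
    return [sequence[i:i + window] for i in range(0, last_start + 1, step)]


if __name__ == "__main__":
    for n in (6, 8, 10):  # hypothetical peptide lengths
        print(n, library_size(n))
    print(overlapping_peptides("ACDEFG", window=4))  # hypothetical parent sequence
```

The count grows as 20 to the power of the peptide length, which is why random libraries are typically screened rather than exhaustively synthesized, whereas the overlapping-peptide approach scales only with the length of the parent sequence.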

Peptides of the invention may be used to identify lead compounds for drug development. 
The structure of the peptides described herein can be readily determined by a number of methods 
such as NMR and X-ray crystallography. A comparison of the structures of peptides similar in 
sequence, but differing in the biological activities they elicit in target molecules can provide 
information about the structure-activity relationship of the target. Information obtained from the 
examination of structure-activity relationships can be used to design either modified peptides, or 
other small molecules or lead compounds which can be tested for predicted properties as related to 
the target molecule. The activity of the lead compounds can be evaluated using assays similar to 
those described herein. 

Information about structure-activity relationships may also be obtained from 
co-crystallization studies. In these studies, a peptide with a desired activity is crystallized in association 
with a target molecule, i.e. a SAM domain, and the X-ray structure of the complex is determined. The 
structure can then be compared to the structure of the target molecule in its native state, and 
information from such a comparison may be used to design compounds expected to possess desired 
activities. 

The peptides of the invention may be converted into pharmaceutical salts by reacting with 
inorganic acids such as hydrochloric acid, sulfuric acid, hydrobromic acid, phosphoric acid, etc., or 
organic acids such as formic acid, acetic acid, propionic acid, glycolic acid, lactic acid, pyruvic acid, 
oxalic acid, succinic acid, malic acid, tartaric acid, citric acid, benzoic acid, salicylic acid, 
benzenesulfonic acid, and toluenesulfonic acids. The peptides of the invention may be used to 
prepare antibodies. Conventional methods can be used to prepare the antibodies. 

The peptides and antibodies specific for the peptides of the invention may be labelled using 
conventional methods with various enzymes, fluorescent materials, luminescent materials and 
radioactive materials. Suitable enzymes, fluorescent materials, luminescent materials, and radioactive 
material are well known to the skilled artisan. Antibodies and labeled antibodies specific for the 
peptides of the invention may be used to screen for proteins containing SAM domains. 

Computer modelling techniques known in the art may also be used to observe the interaction 
of a peptide of the invention, and truncations and analogs thereof with a SAM domain (for example, 
Homology Insight II and Discovery available from BioSym/Molecular Simulations, San Diego, 
California, U.S.A.). If computer modelling indicates a strong interaction, the peptide can be 
synthesized and tested for its ability to interfere with SAM domain dimer formation. 

Compositions and Methods of Treatment 

A purified three dimensional SAM domain structure of the invention, the peptides of the 
invention, and the modulators identified using the methods of the invention may be used to modify 
the inappropriate activity of a SAM domain involved in a clinical disorder. They may be used in the 
treatment and diagnosis of disorders associated with aberrant T cell signaling and to modulate 
telomere function. In particular, they may be useful in methods for therapy of cellular senescence and 
immortalization controlled by telomere length and telomerase activity, and as selective 
immunosuppressants (e.g. in organ transplantation). They may also be useful in the treatment of 
cancers, such as melanoma, ocular melanoma, leukemia, astrocytoma, glioblastoma, lymphoma, 
glioma, Hodgkin's lymphoma, multiple myeloma, sarcoma, myosarcoma, cholangiocarcinoma, 
squamous cell carcinoma, CLL, and cancers of the pancreas, breast, brain, prostate, bladder, thyroid, 
ovary, uterus, testis, kidney, stomach, colon and rectum, particularly leukemia including B-cell 
leukemia, T-cell leukemia, null-cell leukemia, myelogenous leukemia, and lymphocytic leukemia. 

Further, the three dimensional SAM domain structure of the invention, the peptides of the 
invention, and the modulators identified using the methods of the invention may be used to modulate 
the biological activity of an Eph receptor or Eph ligand in a cell, including inhibiting or enhancing 
signal transduction activities of the receptor or ligand, and in particular modulating a pathway in a 
cell regulated by the ligand or receptor, particularly those pathways involved in neuronal 
development, axonal migration, pathfinding and regeneration. The three dimensional SAM domain 
structure of the invention, the peptides of the invention, and modulators identified using the methods 
of the invention will be useful as pharmaceuticals to modulate axonogenesis, nerve cell interactions 
and regeneration, to treat conditions such as neurodegenerative diseases and conditions involving 
trauma and injury to the nervous system, for example Alzheimer's disease, Parkinson's disease, 
Huntington's disease, demyelinating diseases, such as multiple sclerosis, amyotrophic lateral 
sclerosis, bacterial and viral infections of the nervous system, deficiency diseases, such as Wernicke's 
disease and nutritional polyneuropathy, progressive supranuclear palsy, Shy Drager's syndrome, 
multisystem degeneration and olivopontocerebellar atrophy, peripheral nerve damage, and trauma and 
ischemia resulting from stroke. 

The present invention thus provides a method for treating cancer (e.g. leukemia), and 
disorders associated with T cell signaling, modulating telomere function, or affecting neuronal 
development or regeneration, in a subject comprising administering to a subject an effective amount 
of a three dimensional SAM domain structure of the invention, a peptide of the invention, or a 
modulator identified using the methods of the invention. The invention also contemplates a method 
for stimulating or inhibiting axonogenesis in a subject comprising administering to a subject an 
effective amount of a three dimensional SAM domain structure of the invention, a peptide of the 
invention, or a modulator identified using the methods of the invention. 

WO 00/37500 PCT/CA99/01209 

The invention still further relates to a pharmaceutical composition which comprises a 
purified three dimensional SAM domain structure of the invention, a peptide of the invention, or a 
modulator identified using the methods of the invention, and a pharmaceutically acceptable carrier, 
diluent or excipient. The pharmaceutical compositions may be used to stimulate or inhibit neuronal 
development, regeneration and axonal migration associated with neurodegenerative conditions, and 
conditions involving trauma and injury to the nervous system. They may also be used to treat cancer 
and disorders associated with T cell signaling, and modulate telomere function. 

The compositions of the invention are administered to subjects in a biologically compatible 
form suitable for pharmaceutical administration in vivo. By "biologically compatible form suitable 
for administration in vivo" is meant a form of the protein to be administered in which any toxic 
effects are outweighed by the therapeutic effects of the protein. The term subject is intended to 
include mammals and includes humans, dogs, cats, mice, rats, and transgenic species thereof. 
Administration of a therapeutically active amount of the pharmaceutical compositions of the present 
invention is defined as an amount effective, at dosages and for periods of time necessary to achieve 
the desired result. For example, a therapeutically active amount of a three dimensional SAM domain 
structure of the invention, peptides of the invention, or modulators of the invention may vary 
according to factors such as the condition, age, sex, and weight of the individual. Dosage regimes 
may be adjusted to provide the optimum therapeutic response. For example, several divided doses 
may be administered daily or the dose may be proportionally reduced as indicated by the exigencies 
of the therapeutic situation. 

The active compound may be administered in a convenient manner such as by injection 
(subcutaneous, intravenous, etc.), oral administration, inhalation, transdermal application, or 
intracerebral administration. In particular embodiments, pharmaceutical compositions of the 
invention are administered directly to the peripheral or central nervous system, for example by 
administration intracerebrally. 

A pharmaceutical composition of the invention can be administered to a subject in an 
appropriate carrier or diluent, co-administered with enzyme inhibitors or in an appropriate carrier 
such as microporous or solid beads or liposomes. The term "pharmaceutically acceptable carrier" as 
used herein is intended to include diluents such as saline and aqueous buffer solutions. Liposomes 
include water-in-oil-in-water emulsions as well as conventional liposomes (Strejan et al., (1984) J. 
Neuroimmunol 7:27). The active compound may also be administered parenterally or 
intraperitoneally. Dispersions can also be prepared in glycerol, liquid polyethylene glycols, and 
mixtures thereof and in oils. Under ordinary conditions of storage and use, these preparations may 
contain a preservative to prevent the growth of microorganisms. Depending on the route of 
administration, the active compound may be coated to protect the compound from the action of 
enzymes, acids and other natural conditions which may inactivate the compound. 

The pharmaceutical compositions may be administered locally to stimulate axonogenesis 
and pathfinding, for example the compositions may be administered in areas of local nerve injury or 
in areas where normal nerve pathway development has not occurred. The pharmaceutical 
compositions may also be placed in a specific orientation or alignment along a presumptive pathway 
to stimulate axon pathfinding along that line, for example the pharmaceutical compositions may be 
incorporated on microcarriers laid down along the pathway. In particular, the pharmaceutical 
compositions of the invention may be used to stimulate formation of connections between areas of 
the brain, such as between the two hemispheres or between the thalamus and ventral midbrain. The 
pharmaceutical compositions may be used to stimulate formation of the medial tract of the anterior 
commissure or the habenular interpeduncle. 

Therapeutic administration of polypeptides may also be accomplished using gene therapy. A 
nucleic acid including a promoter operatively linked to a heterologous polypeptide may be used to 
produce high-level expression of the polypeptide in cells transfected with the nucleic acid. DNA or 
isolated nucleic acids may be introduced into cells of a subject by conventional nucleic acid delivery 
systems. Suitable delivery systems include liposomes, naked DNA, and receptor-mediated delivery 
systems, and viral vectors such as retroviruses, herpes viruses, and adenoviruses. 

The following non-limiting example is illustrative of the present invention: 
EXAMPLE 

The following methods were used to determine the crystal structure of the SAM domain of 
the Eph receptor isoform A4. 

Protein expression, mutagenesis and purification: The SAM domain of the Eph receptor isoform 
A4 (residues 890 to 981) was expressed in E. coli as a GST fusion protein using the pGEX-2T vector 
(Pharmacia). The QuikChange kit (Stratagene) was used to generate site directed mutants for 
dimerization analysis and for heavy atom phasing. Protein was purified by affinity chromatography 
using glutathione Sepharose beads (Pharmacia). Bound protein was eluted by cleavage with 
thrombin. After concentrating to 10 mM, protein was applied to a Superdex 75 gel filtration column 
(Pharmacia) for final purification and characterization. 

Crystallization and data collection: Hanging drops containing 1 µl of 100 mg/ml native or mutant 
(Glu 941 Cys) protein in 7 mM Hepes pH 7.3 were mixed with equal volumes of reservoir buffer 
containing 100 mM cacodylate pH 6.5, 7% (w/v) PEG 8000, and 20% (v/v) ethylene glycol. Rod like 
crystals of approximate dimensions 0.05 x 0.05 x 0.2 mm were obtained overnight. The crystals 
contain one molecule of the EphA4 SAM domain per asymmetric unit, and belong to the space group 
P6₄ (a = b = 77.14 Å, c = 24.37 Å). The solution dimer corresponds to a crystallographic dimer 
generated from the asymmetric unit by a two fold rotation parallel to the unique crystal axis. Crystals 
were cryo-protected in reservoir buffer enriched to 20% (w/v) PEG 8000 and 20% (v/v) ethylene 
glycol prior to stream freezing. Heavy atom derivatives were prepared by soaking crystals overnight 
in 1-10 mM heavy atom solution prepared in cryo-protection buffer. 
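
The unit-cell dimensions reported above can be turned into a cell volume with the general triclinic formula, which for a hexagonal cell (α = β = 90°, γ = 120°) reduces to V = a²c·√3/2. The Python sketch below is a minimal illustration under two assumptions: that the c axis reads 24.37 Å, and that the cell angles are the standard hexagonal ones implied by the stated space group. The function name is hypothetical.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit-cell volume (cubic angstroms) from cell edges (angstroms) and angles (degrees).

    Uses the general triclinic expression
    V = a*b*c*sqrt(1 - cos^2(alpha) - cos^2(beta) - cos^2(gamma)
                   + 2*cos(alpha)*cos(beta)*cos(gamma)).
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1.0 - ca**2 - cb**2 - cg**2 + 2.0 * ca * cb * cg)


if __name__ == "__main__":
    # Assumed reading of the reported cell: a = b = 77.14, c = 24.37, hexagonal angles.
    volume = cell_volume(77.14, 77.14, 24.37, 90.0, 90.0, 120.0)
    print(f"hexagonal cell volume: {volume:.0f} cubic angstroms")
```

Dividing this volume by the number of asymmetric units in the cell (six for a P6₄ cell) and by the molecular mass of the construct would give the Matthews coefficient; the mass is not stated in this section, so that step is left out.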

Native and derivative diffraction data were collected on frozen crystals (108 K) using a 
Mar 345 imaging plate detector system with an RU200 rotating anode generator (Table I). Data 
processing and reduction was carried out with the HKL, DENZO, and SCALEPACK programs. 

Single isomorphous replacement (SIR) protein phases were calculated using lead derivative 
data collected on two separate protein crystals. The heavy atom site was identified by the Patterson 
search program HASSP [Terwilliger, 1987]. A Glu 941 to Cys site directed mutant of the EphA4 
SAM domain construct was employed for mercury derivatization. The heavy atom position of the 
mercury derivative data, which was collected on three separate crystals, was identified by difference 
Fourier synthesis. Multiple isomorphous replacement and anomalous scattering (MIRAS) phases, 
using only the lead derivative anomalous signal, were calculated and iterative rounds of automatic 
solvent boundary determination/density modification were performed using the PHASES package 
[Furey, 1990]. The resultant experimental electron density map allowed for the complete tracing of 
the SAM domain backbone structure. 

Model building and Refinement: Model building was performed using O [Jones, 1991]. A starting 
model comprising approximately 65% of the total structure was refined using XPLOR [Brunger, 
1992]. Bulk solvent correction was applied during refinement and simulated annealing protocols were 
employed. The remaining structure was built into 2Fo-Fc electron density maps generated with 
XPLOR. The final refinement statistics are shown in Table I. The first 20 residues of the SAM 
domain construct are disordered (residues 890 to 909) and have not been modeled. No amino acid 
residues occupy disallowed regions of the Ramachandran plot and 94% occupy the most favored 
regions. 
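
The Ramachandran statistics quoted above rest on the backbone torsion angles phi (defined by the atoms C of the preceding residue, N, CA, C) and psi (N, CA, C, N of the following residue). As a minimal sketch of how such a torsion angle is computed from four atomic positions, the Python function below uses the standard projection method. It is not the procedure of any refinement or validation program named in this document, and the coordinates in the usage lines are hypothetical test points rather than atoms from Table 2.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond, in (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


if __name__ == "__main__":
    # Hypothetical points: the last atom is rotated a quarter turn about the z axis.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying the function to each residue's (C, N, CA, C) and (N, CA, C, N) atom quartets yields the phi/psi pairs that are then compared against the favored and disallowed regions of the plot.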

Results: 

The X-ray crystal structure of the SAM domain from the EphA4 receptor tyrosine kinase 
(Table 1 and 2) was determined. The boundaries of the structure were defined by limited proteolysis 
and mass-spectrometry. Overall, the structure of the homodimer is oblong and arises from the 
association of two 'lobster claw' shaped subunits. Each subunit possesses a globular fold consisting 
of an N-terminal extended strand segment, followed by four short α helices (α1 to α4) and one long 
C-terminal helix α5 (Figure 2A, 2B, and 2C). The N- and the C-termini are located on one side of 
the subunit fold, similar to other protein interaction modules with signaling function (SH3, SH2, PH 
domains etc.) [Kuriyan, 1997]. However, in contrast to these other domains, the termini compose the 
functional end of the molecule rather than lying opposite to the ligand-binding surface. As shown in 
Figure 3A and 3B, the N-terminal strand region and the C-terminal helix α5 extend from the subunit 
core and interdigitate in a pincer like manner with the termini of a second subunit, to form an 
elaborate dimer interface. In addition to the N- and C-terminal regions, α-helices α1 and α3 
contribute side chains to the dimer interface. 

The N-terminal strands cross in an anti-parallel manner and project the side chains of Ala 
912, Val 913, Val 914 and Phe 910 downward to form one mandible of the 'lobster claw' shaped 
subunit. The C-terminal helices α5 also cross in an anti-parallel manner with each α-helix projecting 
the side chains of Met 972, Met 976, and Met 979 upwards to form the second mandible. Together 
these side chains compose a hydrophobic core that is fully continuous with those of the individual 
subunits. Residues bridging the subunit and interface cores include Trp 919, Ala 922 and Ile 923 from 
helix α1 and Leu 940 and Val 944 from helix α3. Complementing these hydrophobic interactions, 
the conserved side chain of arginine 973 forms intermolecular electrostatic interactions with the free 
carboxylate of glycine 981 and a stabilizing charge/helix dipole interaction with the C-terminus of 
helix α5 (Figure 2C). Additional polar residues located at or in close proximity to the dimer interface 
include His 980, Gln 975, His 945, Gln 977, Glu 941 and Ser 911. 

In order to identify determinants of dimerization and to test that the crystallographic dimer 
model reflects the solution structure of the EphA4 SAM domain, SAM domain residues, either singly 
or in combination, were substituted and the behaviour of these mutants was tested using size 
exclusion chromatography (Figure 4). In agreement with predictions from the crystal structure, 
mutations involving the interface residues Val 913, Val 914, Met 972, Met 976, Met 979, Val 944, 
and Leu 940 abolished dimer formation. In contrast, mutation of Val 969 to Ala, which comprises 
part of the second hydrophobic surface region (Figure 3A and 3B), did not affect dimerization while 
mutation of the proximal residue Ile 959 to Lys appeared to disrupt the integrity of the subunit fold. 
Additionally, mutation of the surface exposed residues Glu 941, Asp 949, and Ser 968 to cysteine 
did not disrupt SAM domain dimerization. In summary, the mutagenesis results are consistent with 
and support the notion that the SAM domain dimer observed in the crystal structure represents a 
mechanism through which the SAM domain associates in solution. 

To investigate whether the dimer model for the Eph receptor SAM domain has more general 
relevance for SAM domain containing proteins, the predicted locations of residues that are required 
for the dimerization of SAM domains on other polypeptides were examined. When mutations that 
map to conserved features of the subunit core and therefore are likely to disrupt the subunit fold are 
eliminated, a number of informative mutations stand out. For example, the homo- and hetero-typic 
dimerization of the Polycomb family of transcriptional repressors ph, RAE28 and Scm, is abolished 
by mutation of two residues predicted to map to the dimer interface [Kyba, 1998]. These residues, Ile 
62 and Trp 1 of the ph SAM domain, correspond to the N-terminal strand residue Phe 910 and the α5 
helix residue Met 972, respectively, of the EphA4 SAM domain. Both residues are highly conserved 
amongst the SAM domains and yet are unlikely to affect the individual subunit fold. The mutation of 
the latter residue (Met 972 to Lys) in the EphA4 SAM domain yields a compact monomer structure 
(Figure 4). In addition, the hetero-dimerization of the SAM domain containing proteins Byr2p and 
Ste4p is disrupted by the substitution of Arg 69 with cysteine [Tu, 1997]. This mutation maps to 
the interface residue Gln 977 of the EphA4 SAM dimer, and is located at the crossing site of the two 
α5 helices. Taken together, these observations indicate that the dimer structure of the EphA4 SAM 
domain may reflect a more general mode of SAM domain dimerization. 

The crystallographic model for SAM domain dimerization is attractive for a number of 
reasons. Firstly, in the case of the Eph receptors, the linker between the SAM and the catalytic 
domains is short (5 residues of poorly conserved sequence) so that the N-termini of the dimer would 
have to be oriented in the same direction and in close proximity if the kinase domains of clustered 
receptors were to be juxtaposed. The structure shows this to be the case. Secondly, the mechanism 
of dimerization revealed by the structure could account for the observation that the SAM domain is 
found at either terminus of signaling proteins. Because the N- and C-terminal ends of the SAM 
domain compose the dimer interface, the insertion of a SAM domain at an internal site in a 
polypeptide chain would sterically restrict access to a second SAM domain, especially if the host 
sequence was itself structured. The solutions to this dilemma would be to place a SAM domain at the 
end of a protein (as is usually observed), or to surround it with long linker sequences. In this regard 
the SAM domain differs from modules such as SH2 and SH3 domains, which can readily be located 
at internal positions in a polypeptide chain since the ligand-binding site is located opposite to the 
location of the N- and C-termini [Kuriyan, 1997]. Thirdly, in the case of the liprins we have noted 
three adjacent SAM domains in a region previously shown to mediate liprin hetero-dimerization 
[Serra-Pages, 1998]. Because the C-termini of the dimerized SAM domain are in close proximity, on 
the opposite side from the N-termini, a configuration of stacked SAM domains can be readily 
envisioned. 

SAM dimerization may contribute to receptor oligomerization and activation by bringing 
catalytic elements into proximity for autophosphorylation. The SAM domain may have a direct 
inhibitory interaction with the kinase domain that can be competed away by dimerization. 
Alternatively SAM domain mediated dimerization might maintain opposing catalytic domains in a 
mutually inaccessible, and thus repressed state. The Eph SAM domains might also recruit signaling 
partners through heteromeric SAM-SAM interactions, or through specific recognition of cytoplasmic 
proteins by the Eph SAM dimer. 

SAM dimerization might be constitutive, but controlled through co-operative or antagonistic 
interactions with other clustering forces. Dimerization could potentially be controlled by 
modifications such as tyrosine phosphorylation, and indeed a residue within the SAM domain of the 
EphB1 receptor can become tyrosine phosphorylated in vivo [Stein, 1996]. Finally, the five residues 
that lie C-terminal to the Eph SAM domain represent a potential binding site for PDZ domain 
proteins [Hock, 1998], which might influence the organization of the SAM domain. 

The structure of the EphA4 SAM domain reveals a novel mechanism through which modular 
domains control protein-protein interactions. Since SAM domains are found in cell surface receptors, 
cytoplasmic signaling proteins, and transcriptional activators and repressors, as well as chimeric 
human oncoproteins, these results have general implications for understanding the formation of 
complexes involved in normal and oncogenic signal transduction. 

Having illustrated and described the principles of the invention in a preferred embodiment, it 
should be appreciated by those skilled in the art that the invention can be modified in arrangement and 
detail without departure from such principles. All modifications coming within the scope of the 
following claims are claimed. 

All publications, patents and patent applications referred to herein are incorporated by 
reference in their entirety to the same extent as if each individual publication, patent or patent 
application was specifically and individually indicated to be incorporated by reference in its entirety. 

Detailed Description of the Drawings 

Figure 1A shows a sequence alignment of SAM domains from selected proteins. Secondary 
structure is indicated for the SAM domain from the EphA4 receptor tyrosine kinase. Residue 
numbers for the start of each SAM domain are shown on the left and GenBank accession numbers on 
the right. Conserved hydrophobic residues are colored green, acidic residues red, basic residues 
blue, polar residues orange and glycines are colored pink. Residues at the dimer interface shown in 
Figure 2C are indicated (•). Liprin α1 contains 3 SAM domains designated S1, S2 and S3. 

Figure 1B shows a selection of multi-domain proteins containing a SAM domain (S). 
Domains listed include tyrosine or serine/threonine kinase catalytic domains, myosin-like domain, 
F-actin binding domain (F-actin BD), PDZ domain, SH2 domain, inositol phosphatase catalytic domain 
(inositol p'tase), GTPase activating domain (GAP), DNA-binding domain (DNA-BD) and a 
transmembrane region (TM). 

Figure 2A, 2B, and 2C. Ribbons depiction of the SAM homo-dimer viewed (Figure 2A) 
down the twofold symmetry axis and (Figure 2B) perpendicular to the symmetry axis. The dimer 
subunits are coloured red and blue and α-helices are labeled. (Figure 2C) Ribbons stereo view 
highlighting the dimer interface region. Aromatic, aliphatic, methionine, histidine and arginine 
interacting side chains are coloured light blue, green, yellow, orange, and blue (see Figure 1A for 
residue identification). All ribbon diagrams were generated using RIBBONS [Carson, 1991]. 

Figure 3A, B. Molecular surface and worm representations of the SAM homodimer. The 
molecular surface of one subunit is shown with hydrophobic (Met, Val, Leu, Ile, Phe), basic (Arg, 
Lys) and acidic (Glu, Asp) side chains coloured green, blue and red, respectively. The two 
perspectives differ by a 90° rotation about the vertical axis. In Figure 3B the twofold rotation axis 
relating the two subunits of the dimer is shown. The buried surface area of the dimer interface is 
1923 Å². All molecular surfaces were generated using GRASP [Nicholls, 1991]. 

Figure 4. Gel filtration elution profile of wild type and single or double site mutants of the 
EphA4 receptor SAM domain. Chromatograms correspond to the loading of equivalent 
concentrations (10 mM) and total volumes (100 µl) of protein on a Superdex-75 gel filtration column 
(24 ml bed volume). The column was calibrated using Pharmacia low molecular weight standards. 

WE CLAIM: 

1. A purified three dimensional structure of a polypeptide corresponding to one or more SAM
domains.

2. A three dimensional structure as claimed in claim 1, wherein the SAM domain is a SAM domain
of an Eph receptor.

3. A three dimensional structure as claimed in claim 2 wherein the Eph receptor is EphA.

4. A three dimensional structure as claimed in claim 1 complexed with one or more compounds.

5. A three dimensional structure as claimed in claim 1 comprising one or more heavy metal atoms.

6. A purified crystalline form of a polypeptide corresponding to one or more SAM domains.

7. A crystalline form as claimed in claim 6 having dimensions of about a = b = 77.14 ± 0.03
angstroms, c = 24.3 ± 0.04 angstroms.

8. A crystalline form as claimed in claim 7 having the co-ordinates set out in Table 2.

9. A method of forming a crystalline form as claimed in claim 6 comprising 
(a) mixing a volume of a SAM domain with a reservoir solution; and

(b) incubating the mixture obtained in step (a) over the reservoir solution in a closed container 
under conditions suitable for crystallization. 

10. A method of determining three dimensional structures of polypeptides with SAM domains of 
unknown structure comprising the step of applying the structural atomic coordinates of a three 
dimensional structure as claimed in claim 1 or a crystalline form as claimed in claim 7 or 8. 

11. A method for identifying a potential modulator of a SAM domain of an Eph receptor function
comprising docking a computer representation of a structure of a compound with a computer
representation of a structure of one or more SAM domains of an Eph receptor that is defined by
the atomic structural coordinates of the three dimensional structure as claimed in claim 2 or a
crystalline form as claimed in claim 7 or 8.

12. A method as claimed in claim 11 comprising the following steps:

(a) docking a computer representation of a compound from a computer data base with a
computer representation of a selected site on a three dimensional structure of a SAM domain
of an Eph receptor as claimed in claim 2 or a crystalline form as claimed in claim 7 or 8 to
obtain a complex;

(b) determining a conformation of the complex with a favourable geometric fit and favourable
complementary interactions; and

(c) identifying compounds that best fit the selected site as potential modulators of SAM domain
function.
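
As an illustration only, steps (a) to (c) can be sketched as a ranking loop over docked poses. The sketch below assumes that some external docking program has already produced, for each compound, one or more poses with a geometric-fit value and an interaction value; the `Pose` record, the equal weighting, and the "higher is better" scale are all hypothetical choices, not part of the claimed method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose:
    """One docked conformation of a compound at the selected site (hypothetical record)."""
    compound: str
    geometric_fit: float      # assumed scale: higher means better shape complementarity
    interaction_score: float  # assumed scale: higher means more favourable contacts


def combined_score(pose, w_geom=0.5, w_inter=0.5):
    """Weighted sum of the two criteria named in step (b); the weights are assumptions."""
    return w_geom * pose.geometric_fit + w_inter * pose.interaction_score


def rank_compounds(poses, top_n=3):
    """Keep the best-scoring pose per compound, then return the top_n compounds (step (c))."""
    best = {}
    for pose in poses:
        score = combined_score(pose)
        if pose.compound not in best or score > best[pose.compound]:
            best[pose.compound] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


if __name__ == "__main__":
    # Placeholder poses with made-up values, purely to show the call pattern.
    demo = [
        Pose("compound_A", 0.2, 0.4),
        Pose("compound_A", 0.8, 0.6),
        Pose("compound_B", 0.5, 0.5),
    ]
    for name, score in rank_compounds(demo):
        print(name, round(score, 3))
```

In practice the two scores would come from the docking software's own scoring function, and the weighting would need to be chosen for that scale.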

13. A method as claimed in claim 11, comprising the following steps:

(a) modifying a computer representation of a compound complexed with a selected site on
a three dimensional structure of a SAM domain of an Eph receptor as claimed in claim
2 or a crystalline form as claimed in claim 7 or 8, by deleting or adding a chemical
group or groups;

(b) determining a conformation of the complex with a favourable geometric fit and
favourable complementary interactions; and

(c) identifying a compound that best fits the selected site as a potential modulator of a SAM
domain.

14. A method as claimed in claim 11 comprising the following steps:

(a) selecting a computer representation of a compound complexed with a selected site on a three
dimensional structure of a SAM domain of an Eph receptor as claimed in claim 2 or a
crystalline form as claimed in claim 7 or 8; and

(b) searching for molecules in a data base that are similar to the compound using a searching 
computer program, or replacing portions of the compound with similar chemical structures 
from a data base using a compound building computer program. 
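
The claim does not name a similarity measure for step (b). One common choice in compound searching is the Tanimoto coefficient computed over sets of structural features (for example, fingerprint bits), and the minimal sketch below assumes that measure. The feature sets, the compound names, and the 0.5 threshold are hypothetical.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    return len(a & b) / len(a | b)


def find_similar(query_features, database, threshold=0.5):
    """Return (name, similarity) pairs at or above the threshold, most similar first.

    `database` maps a compound name to its feature set (an assumed layout).
    """
    hits = []
    for name, features in database.items():
        similarity = tanimoto(query_features, features)
        if similarity >= threshold:
            hits.append((name, similarity))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Placeholder feature sets; real ones would come from a fingerprinting tool.
    demo_db = {"cmpd_x": {1, 2, 3}, "cmpd_y": {2, 3, 4}, "cmpd_z": {9}}
    print(find_similar({1, 2, 3}, demo_db))
```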

15. A potential modulator of a function of a SAM domain of an Eph receptor identified by a method
as claimed in any one of claims 11 to 14.

16. A method of treating a disease associated with a SAM domain of an Eph receptor with 
inappropriate activity in a cellular organism, comprising: 

(a) administering a crystalline form of a polypeptide as claimed in claim 6 or a modulator 
identified using a method as claimed in any one of claims 11 to 14, in an acceptable
pharmaceutical preparation; and 

(b) activating or inhibiting a SAM domain function to treat the disease. 

17. A method as claimed in claim 16 wherein the disease is a cell proliferative disease or disease 
associated with the nervous system. 

18. A peptide of the formula I which mediates SAM domain function:

wherein X and X6 represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20
amino acids, and X1 represents Leu, Phe, Asp, Ala, Glu, or Gly, preferably Leu or Gly, X2
represents Glu, Asp, Ser, Ile, Ala, Arg, Lys, or Gln, preferably Glu or Asp, X3 represents Ala,
Val, Glu, Phe, Ser, Ile, Met, Leu, His, Gln, Arg, or Asp, preferably Ala, Val or Phe, X4 is Val,
Leu, Met, Phe, or Ile, preferably Val, Leu, or Phe, and X5 is Val, Ser, Leu, Asp, Ala, Pro, Asn,
Lys, or Cys, preferably Val or Ser.
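
For illustration, the residue choices recited for formula I can be written as a position-by-position pattern and checked against a candidate peptide. This is a minimal sketch under two stated assumptions: that formula I has the linear order X, X1, X2, X3, X4, X5, X6 (the formula line itself is not reproduced in this text), and that the garbled residue names in the extracted text ("lie", "He", "Gin") stand for Ile and Gln. The function name and the one-letter encoding are choices made here, not the patent's.

```python
import re

# Allowed residues at core positions X1..X5, in one-letter code.
# Assumption: transcribed from the claim 18 residue lists, reading "He"/"lie" as Ile, "Gin" as Gln.
FORMULA_I_CORE = (
    "LFDAEG",        # X1: Leu, Phe, Asp, Ala, Glu, Gly
    "EDSIARKQ",      # X2: Glu, Asp, Ser, Ile, Ala, Arg, Lys, Gln
    "AVEFSIMLHQRD",  # X3: Ala, Val, Glu, Phe, Ser, Ile, Met, Leu, His, Gln, Arg, Asp
    "VLMFI",         # X4: Val, Leu, Met, Phe, Ile
    "VSLDAPNKC",     # X5: Val, Ser, Leu, Asp, Ala, Pro, Asn, Lys, Cys
)
CORE_RE = re.compile("".join(f"[{allowed}]" for allowed in FORMULA_I_CORE))
CORE_LEN = len(FORMULA_I_CORE)


def matches_formula_i(peptide, max_flank=70):
    """True if some 5-residue window fits X1..X5 with both flanks (X, X6) of 0..max_flank residues."""
    pep = peptide.upper()
    for start in range(len(pep) - CORE_LEN + 1):
        n_flank = start
        c_flank = len(pep) - (start + CORE_LEN)
        window = pep[start:start + CORE_LEN]
        if n_flank <= max_flank and c_flank <= max_flank and CORE_RE.fullmatch(window):
            return True
    return False


if __name__ == "__main__":
    for candidate in ("LEAVV", "FDVVS", "LEFLS", "GARFL", "WWWWW"):
        print(candidate, matches_formula_i(candidate))
```

The four five-residue cores recited in claim 21 (LEAVV, FDVVS, LEFLS, GARFL) each satisfy this pattern, which is a useful cross-check on the reading of the residue lists.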

19. A peptide as claimed in claim 18 wherein X represents TT, ID, TS, DD, GYTT (SEQ ID. NO. 
38), AAGYTT (SEQ ID. NO. 39), FTAAGYTT (SEQ ID. NO. 40), DNFTAAGYTT (SEQ ID. 
NO. 41), or YKDNFTAAGYTT (SEQ ID. NO. 42). 

20. A peptide as claimed in claim 18 wherein X6 represents HM, HMSQ (SEQ ID. NO. 43),
HMSQD (SEQ ID. NO. 44), HMSQDD (SEQ ID. NO. 45), HMSQDDLA (SEQ ID. NO. 46),
QMMM (SEQ ID. NO. 47), QMMMED (SEQ ID. NO. 48), QMMMEDLL (SEQ ID. NO. 49),
DITE (SEQ ID. NO. 50), DITEED (SEQ ID. NO. 51), DITEEDL (SEQ ID. NO. 52), NLTE
(SEQ ID. NO. 53), NLTEND (SEQ ID. NO. 54), or NLTENDI (SEQ ID. NO. 55).

21. A peptide of the formula I as claimed in claim 18 which is LEAVV (SEQ ID. NO. 56),
TTLEAVV (SEQ ID. NO. 57), LEAVVHM (SEQ ID. NO. 58), LEAVVHMSQ (SEQ ID. NO.
59), LEAVVHMSQD (SEQ ID. NO. 60), LEAVVHMSQDDL (SEQ ID. NO. 61),
LEAVVHMSQDDLAR (SEQ ID. NO. 62), TTLEAVVHMS (SEQ ID. NO. 63),
TTLEAVVHMSQD (SEQ ID. NO. 64), TTLEAVVHMSQDDL (SEQ ID. NO. 65),
TTLEAVVHMSQDDLAR (SEQ ID. NO. 66), GYTTLEAVV (SEQ ID. NO. 67),
GYTTLEAVVHMS (SEQ ID. NO. 68), GYTTLEAVVHMSQD (SEQ ID. NO. 69),
GYTTLEAVVHMSQDDL (SEQ ID. NO. 70), GYTTLEAVVHMSQDDLAR (SEQ ID. NO.
71), FDVVS (SEQ ID. NO. 72), FDVVSQ (SEQ ID. NO. 73), FDVVSQMM (SEQ ID. NO. 74),
FDVVSQMMME (SEQ ID. NO. 75), FDVVSQMMMEDIL (SEQ ID. NO. 76), TSFDVVS
(SEQ ID. NO. 77), TSFDVVSQ (SEQ ID. NO. 78), TSFDVVSQMM (SEQ ID. NO. 79),
TSFDVVSQMMME (SEQ ID. NO. 80), TSFDVVSQMMMEDIL (SEQ ID. NO. 81), LEFLS
(SEQ ID. NO. 82), LEFLSD (SEQ ID. NO. 83), LEFLSDIT (SEQ ID. NO. 84), LEFLSDITEE
(SEQ ID. NO. 85), LEFLSDITEEDL (SEQ ID. NO. 86), DDLEFLS (SEQ ID. NO. 87),
GWDDLEFLS (SEQ ID. NO. 88), DDLEFLSD (SEQ ID. NO. 89), DDLEFLSDIT (SEQ ID.
NO. 90), DDLEFLSDITEE (SEQ ID. NO. 91), DDLEFLSDITEEDL (SEQ ID. NO. 92),
GARFL (SEQ ID. NO. 93), GARFLN (SEQ ID. NO. 94), GARFLNLT (SEQ ID. NO. 95),
GARFLNLTEN (SEQ ID. NO. 96), and IDGARFL (SEQ ID. NO. 97).

22. A peptide of the formula II which mediates SAM domain function:

X7-X8-X9-X10-X11-X12-X13-X14-X15-X16

wherein X7 and X16 represent 0 to 70, preferably 0 to 50 amino acids, more preferably 2 to 20
amino acids, and X8 represents Met, Ile, Ser, Leu, Asn, Phe, or Val, preferably Met, X9 represents
Arg, Ser, Lys, Met, Leu, Glu, Gln, or Asn, preferably Cm or Arg, X10 represents Thr, Ala, Arg,
Leu, Ser, Glu, Asp, Met, Lys, Gln, or Gly, preferably Thr, Ala, or Glu, X11 represents Gln, Ser,
Glu, Leu, Phe, Asp, Thr, or Arg, preferably Gln or Arg, X12 represents Met, Ala, Ile, Asn, Ser, Arg,
Thr, Pro, Leu, Gln, Val, or Lys, preferably Met or Arg, X13 represents Gln, Asn, Pro, Ser, Tyr, Gln,
Leu, Arg, or Lys, preferably Gln, Asn, or Arg, X14 represents Gln, Ala, Pro, Asp, Leu, Lys, Ile,
Glu, Arg, or Asn, preferably Gln or Ile, and X15 represents Met, Ile, Val, His, Ser, Arg, Lys, Phe,
Cys, Gln, Tyr, Ala, Ile, Trp, or Leu.

23. A peptide of the formula II as claimed in claim 22 wherein X7 represents QA, QV, NK, SVQA
(SEQ ID. NO. 98), LSSVQA (SEQ ID. NO. 99), ILSSVQA (SEQ ID. NO. 100), NKILSSVQA
(SEQ ID. NO. 101), HQNKILSSVQA (SEQ ID. NO. 102), THQNKILSSVQA (SEQ ID. NO.
103), ENIK (SEQ ID. NO. 104), SQEINK (SEQ ID. NO. 105), KLSQEINK (SEQ ID. NO.
106), ILNSIQV (SEQ ID. NO. 107), or NSIQV (SEQ ID. NO. 108).

24. A peptide of the formula II as claimed in claim 22 wherein X16 is HG, QS, HGRM (SEQ ID. NO.
109), HGRMVP (SEQ ID. NO. 110), QSVEV (SEQ ID. NO. 111), or TRKP (SEQ ID. NO. 112).

25. A peptide of the formula II as claimed in claim 22 which is MRTQMQQM (SEQ ID. NO. 113),
QAMRTQMQQM (SEQ ID. NO. 114), SVQAMRTQMQQM (SEQ ID. NO. 115),
LSSVQAMRTQMQQM (SEQ ID. NO. 116), ILSSVQAMRTQMQQM (SEQ ID. NO. 117),
MRTQMQQMHG (SEQ ID. NO. 118), MRTQMQQMHGRM (SEQ ID. NO. 119),
MRTQMQQMHGRMVPV (SEQ ID. NO. 120), NEERRSIF (SEQ ID. NO. 121),
INKNEERRSIF (SEQ ID. NO. 122), NEERRSIFTRKP (SEQ ID. NO. 123), MRAQMNQI
(SEQ ID. NO. 124), MRAQMNQIQS (SEQ ID. NO. 125), or MRAQMNQIQSVEV (SEQ ID. NO.
126).

26. A peptide which mediates SAM domain function comprising VVSV (SEQ ID. NO. 21),
SAVVSV (SEQ ID. NO. 22), FSAVV (SEQ ID. NO. 23), FSAVVSV (SEQ ID. NO. 24),
FSAVVSVGD (SEQ ID. NO. 25), VVSVGDWL (SEQ ID. NO. 26), FNTV (SEQ ID. NO. 27),
FNTVDE (SEQ ID. NO. 28), FNTVDEWL (SEQ ID. NO. 29), TSFNTVDEWL (SEQ ID. NO.
30), TSFNTV (SEQ ID. NO. 31), YTSFNTV (SEQ ID. NO. 32), RSEV (SEQ ID. NO. 33),
RSEVLG (SEQ ID. NO. 34), RSEVLGWD (SEQ ID. NO. 35), VPFRSEV (SEQ ID. NO. 36),
and VPFRSEVLGW (SEQ ID. NO. 37).

27. A pharmaceutical composition comprising a peptide as claimed in any one of claims 18 to 26 and
a pharmaceutically acceptable carrier, diluent or excipient.

SEQ. ID. NO. 1
EPHA4 

PEFSAVVSVGDWLQAiKMDRYKDNFTAAGY 
AMRTQMQQMHGRMVPV 

SEQ. ID. NO. 2 

EPHB2 

PDYTSFNTVDEWUL\IKMGQYKi£FANAGFrSFDW 
QVMRAQMNQIQSVEV 

SEQ. ID. NO. 3 

DGK-delta

VHLVGTEE^AAWLElILSlXEYKDIFTRHDIRGSELLFIUiRR^ 
LSRSAPAVEA 

SEQ. ID. NO. 4 

SHIP2 

^GEAGMSAWUIAIGLERYEEGLVHNGWDDLEH^DIT^ 

SEQ. ID. NO. 5 
RhoGAP p122

LTQIEAKEACDWIJ*ATGFPQYAQLYEDFLFPIDISLVKREHDF^ 
MKLE1SPHRKRS 

SEQ. ID. NO. 6 

Liprin α1-S1

QWIKSPTVVVWI^WVGWAWYVAACRAW^ 
QE1MSLTSPSAPPT 

SEQ. ID. NO. 7 

Liprin α1-S2

NIIEWIGNEWLPSLGLPQYRSYFMECLVDA^ 
LRRLNYDRKELE 

SEQ. ID. NO. 8 

Liprin α1-S3

VLVWSNDRVIRWILSIGLKYANNUESGVHGALLALDETFDFSAIJUXL^ 
EFNNLLVMGT 

SEQ. ID. NO. 9
Cortactin-BP1

VHLWTKPDVADWLESU^EHKETFMDNE^ 
QLLDR 

SEQ. ID. NO. 10 
Neurabin 

VHEWS V OO VS H WL V GI^SLDQ Y VSEFS AQNTSGEQLLQLDGN KLKALGMTSSQDRAL VK K KL 
KEMKMSLEKARKAQ 

SEQ. ID. NO. 11

SLP-76 

RNVPFRSEVLGW D PDSLAD YPKKLNYDCEK A VKK YHIDG AJOlJ^TLTENDI QKl^KLR VPDLS K 
LSQEINKNEERRSEFTRKP 

SEQ. ID. NO. 12

Byr2p (S. pombe)

ME Y YTSKEVAEWLKSIGLEK YIEQFSQNWIEXjRHU^HLTLI^KDLGIENT AKGKQHJCQD YL 
REFPRPCILRF 

SEQ. ID. NO. 13 

Ste4 (S. pombe)

YWNWNNEAVCNWIEQLGFPHKEAFFJ>YHILG 
KKQKDKL.QQE 

SEQ. ID. NO. 14 

Ste11 (S. cerevisiae)

RDJOUE^^ L ^^ 
SEQ. ID. NO. 15 
STE50 (S. cerevisiae)

FSQWSVDDVrmaSTIXVEETDPLCQRlJlEKDIVGDIXPELCLQDLCDGDI^ 
MRDSKLEWKDDK 

SEQ. ID. NO. 16 

ETS-1

PRQWTETHVRDWVMWAVNEFSUCGVDFQKIO^ 
LEILQKEDVKPYQVNG 

SEQ. ID. NO. 17 

FLI1

PTLWTQEHVRQWI^WAIKEYSUrfEIDTS
SYLRESSLLAYNTT 

SEQ. ID. NO. 18

TEL 

PmVSRDDVAQWlJCWA£NEESIJ^n)SNTFE\iNGKAUXLTKEDFR 
LKQRKPR1LFSP 

SEQ. ID. NO. 19 

RAE28 

PSQWSWEVYEFIASLOGCQEIAEEFRSQEIDGQAJLLIXKEEHLMSAMNIKLGPA^ 
KET 

SEQ. ID. NO. 20 
Scm

PIDWTlFJEVIQYIESNDNSLAVHGDLFRKJiEIDGKAlJJXNSEMMMKYMGLKLGPALKICNLV 
NKVNGRRNNLAL 

SEQ. ID. NO. 21
VVSV
SEQ ID. NO. 22
SAVVSV
SEQ ID. NO. 23
FSAVV
SEQ ID. NO. 24
FSAVVSV
SEQ ID. NO. 25
FSAVVSVGD
SEQ ID. NO. 26
VVSVGDWL
SEQ ID. NO. 27
FNTV
SEQ ID. NO. 28
FNTVDE
SEQ ID. NO. 29
FNTVDEWL
SEQ ID. NO. 30
TSFNTVDEWL
SEQ ID. NO. 31
TSFNTV
SEQ ID. NO. 32
YTSFNTV
SEQ ID. NO. 33
RSEV
SEQ ID. NO. 34
RSEVLG
SEQ ID. NO. 35
RSEVLGWD
SEQ ID. NO. 36
VPFRSEV
SEQ ID. NO. 37
VPFRSEVLGW
SEQ ID. NO. 38
GYTT
SEQ ID. NO. 39
AAGYTT
SEQ ID. NO. 40
FTAAGYTT
SEQ ID. NO. 41
DNFTAAGYTT
SEQ ID. NO. 42
YKDNFTAAGYTT
SEQ ID. NO. 43
HMSQ
SEQ ID. NO. 44
HMSQD
SEQ ID. NO. 45
HMSQDD
SEQ ID. NO. 46
HMSQDDLA
SEQ ID. NO. 47
QMMM
SEQ ID. NO. 48
QMMMED
SEQ ID. NO. 49
QMMMEDLL
SEQ ID. NO. 50
DITE
SEQ ID. NO. 51
DITEED
SEQ ID. NO. 52
DITEEDL
SEQ ID. NO. 53
NLTE
SEQ ID. NO. 54
NLTEND
SEQ ID. NO. 55
NLTENDI
SEQ ID. NO. 56
LEAVV
SEQ ID. NO. 57
TTLEAVV
SEQ ID. NO. 58
LEAVVHM
SEQ ID. NO. 59
LEAVVHMSQ
SEQ ID. NO. 60
LEAVVHMSQD
SEQ ID. NO. 61
LEAVVHMSQDDL
SEQ ID. NO. 62
LEAVVHMSQDDLAR
SEQ ID. NO. 63
TTLEAVVHMS
SEQ ID. NO. 64
TTLEAVVHMSQD
SEQ ID. NO. 65
TTLEAVVHMSQDDL
SEQ ID. NO. 66
TTLEAVVHMSQDDLAR
SEQ ID. NO. 67
GYTTLEAVV
SEQ ID. NO. 68
GYTTLEAVVHMS
SEQ ID. NO. 69
GYTTLEAVVHMSQD
SEQ ID. NO. 70
GYTTLEAVVHMSQDDL
SEQ ID. NO. 71
GYTTLEAVVHMSQDDLAR
SEQ ID. NO. 72
FDVVS
SEQ ID. NO. 73
FDVVSQ
SEQ ID. NO. 74
FDVVSQMM
SEQ ID. NO. 75
FDVVSQMMME
SEQ ID. NO. 76
FDVVSQMMMEDIL
SEQ ID. NO. 77
TSFDVVS
SEQ ID. NO. 78
TSFDVVSQ
SEQ ID. NO. 79
TSFDVVSQMM
SEQ ID. NO. 80
TSFDVVSQMMME
SEQ ID. NO. 81
TSFDVVSQMMMEDIL
SEQ ID. NO. 82
LEFLS
SEQ ID. NO. 83
LEFLSD
SEQ ID. NO. 84
LEFLSDIT
SEQ ID. NO. 85
LEFLSDITEE
SEQ ID. NO. 86
LEFLSDITEEDL
SEQ ID. NO. 87
DDLEFLS
SEQ ID. NO. 88
GWDDLEFLS
SEQ ID. NO. 89
DDLEFLSD
SEQ ID. NO. 90
DDLEFLSDIT
SEQ ID. NO. 91
DDLEFLSDITEE
SEQ ID. NO. 92
DDLEFLSDITEEDL
SEQ ID. NO. 93
GARFL
SEQ ID. NO. 94
GARFLN
SEQ ID. NO. 95
GARFLNLT
SEQ ID. NO. 96
GARFLNLTEN
SEQ ID. NO. 97
IDGARFL
SEQ ID. NO. 98
SVQA
SEQ ID. NO. 99
LSSVQA
SEQ ID. NO. 100
ILSSVQA
SEQ ID. NO. 101
NKILSSVQA
SEQ ID. NO. 102
HQNKILSSVQA
SEQ ID. NO. 103
THQNKILSSVQA
SEQ ID. NO. 104
ENIK
SEQ ID. NO. 105
SQEINK
SEQ ID. NO. 106
KLSQEINK
SEQ ID. NO. 107
ILNSIQV
SEQ ID. NO. 108
NSIQV
SEQ ID. NO. 109
HGRM
SEQ ID. NO. 110
HGRMVP
SEQ ID. NO. 111
QSVEV
SEQ ID. NO. 112
TRKP
SEQ ID. NO. 113
MRTQMQQM
SEQ ID. NO. 114
QAMRTQMQQM
SEQ ID. NO. 115
SVQAMRTQMQQM
SEQ ID. NO. 116
LSSVQAMRTQMQQM
SEQ ID. NO. 117
ILSSVQAMRTQMQQM
SEQ ID. NO. 118
MRTQMQQMHG
SEQ ID. NO. 119
MRTQMQQMHGRM
SEQ ID. NO. 120
MRTQMQQMHGRMVPV
SEQ ID. NO. 121
NEERRSIF
SEQ ID. NO. 122
INKNEERRSIF
SEQ ID. NO. 123
NEERRSIFTRKP
SEQ ID. NO. 124
MRAQMNQI
SEQ ID. NO. 125
MRAQMNQIQS
SEQ ID. NO. 126
MRAQMNQIQSVEV